Statistics and Risk
Management
Regression
Video URL:
jukebox.esc13.net/untdeveloper/Videos/Regression.mov
Vocabulary List:
Variable: an attribute or characteristic of a statistical unit that differs among a
population of units (height, age, location, etc.)
Independent Variable: the variable that changes or can be controlled; affects
the dependent variable.
Dependent Variable: the variable that is measured to determine the effect of
an independent variable.
Correlation: the measurement of a relationship between two variables.
Regression: the technique of predictive relationships based upon
correlational data.
Scatter chart: a graph representing data points (pairs of variables) charted
along the x and y axes.
Linear Regression: the technique of fitting a straight line (regression line) to
the data points on a scatter chart to determine the relationship between two
variables.
Regression Line: a line drawn through the data points on a scatter chart
showing the relationship between the variables. (Easton and McColl, 1997)
Easton, V.J., and McColl, J.H. (1997). Statistics glossary. Available from
http://www.stats.gla.ac.uk/steps/glossary/index.html
Copyright © Texas Education Agency, 2012. All rights reserved.
Resources:
Introduction to Linear Regression and Correlation Analysis
Use this link to: calculate and interpret the simple correlation between
two variables, determine whether the correlation is significant, calculate
and interpret the simple linear regression equation for a set of data, and
understand the assumptions behind regression analysis.
www.fordham.edu/economics/vinod/correl-regr.ppt
Regression and Correlation Analysis
This site analyzes the concepts of regression and correlation, discusses the
regression model, explains the least squares method, and defines the relationship
between correlation and regression analysis.
http://abyss.uoregon.edu/~js/glossary/correlation.html
Regression
Through this interactive Regression lesson, instructors can set up an online lesson
that correlates with their textbooks. The learner tab explains regression, the activity
tab contains the actual activity, the help tab provides assistance on using the
activity, and the instructor tab allows teachers to set up the activity to fit with their
lecture concepts.
http://www.shodor.org/interactivate/activities/Regression/
Regression Line Example
This video from Khan Academy provides a detailed example of how to calculate
the slope and y-intercept for a regression line. The video provides supplemental
instruction to accompany a teacher’s lecture on regression.
http://www.khanacademy.org/math/statistics/v/regression-line-example
Regression Practice Test
Name:_____________________
TRUE and FALSE:
1. The correlation demonstrating the relationship between 2 sets of data is not a
calculated value.
   A. True   B. False
2. Correlation and Regression Analysis are related in that they both deal with
relationships among variables.
   A. True   B. False
3. The correlation coefficient is a measure of linear association with only one variable.
   A. True   B. False
4. Regression and correlation analysis should be interpreted as establishing a
cause-and-effect relationship.
   A. True   B. False
5. A correlation coefficient of -1 indicates 2 variables are related in a negative linear
sense.
   A. True   B. False
6. Linear regression consists of finding the best-fitting straight line through the points.
   A. True   B. False
MATCHING:
A. Linear Regression
B. Criterion Variable
C. Simple Regression
D. Regression Analysis
E. Least Squares Method
7. __________ When there is only one predictor variable, we use this prediction method.
8. __________ The prediction of scores on one variable from the scores on a second
variable.
9. __________ Identifies the relationship between a dependent variable and one or more
independent variables.
10. __________ Most widely used procedure for developing estimates of the model
parameters.
11. __________ The variable we are predicting.
MULTIPLE CHOICE:
12. ____ is the measurement of a relationship between two variables.
   A. Regression   B. Variance   C. Correlation   D. Deviation
13. ____ is the technique of predictive relationships based upon correlational data.
   A. Regression   B. Variance   C. Correlation   D. Deviation
14. Values of the correlation coefficient are always between ____ and ____.
   A. -1 and 1   B. -2 and 2   C. -3 and 3   D. -4 and 4
15. A correlation coefficient of ____ indicates 2 variables are related in a positive
linear sense.
   A. -1   B. 0   C. +1   D. +2
16. A correlation coefficient of ____ indicates that there is no linear relationship
between the 2 variables.
   A. -1   B. 0   C. +1   D. +2
17. The best-fitting line in a linear regression is called the ____.
   A. Line of Correlation   B. Regression Line   C. Line of Best Fit   D. None of the above
18. If a point is much higher than the regression line, it will have a _________ error of
prediction.
   A. Small   B. Positive   C. Large   D. Negative
19. The Excel formula to find correlation is _______.
   A. =CORREL(Factor1,Factor2)   B. =CALC(Factor1,Factor2)
   C. CORREL(Factor1,Factor2)   D. CALC(Factor1,Factor2)
20. _______ are simply deviations from the mean.
   A. Outliers   B. Raw Scores   C. Deviation Scores   D. None of the Above
Regression Practice Test KEY
1. B
2. A
3. B
4. B
5. A
6. A
7. C
8. A
9. D
10. E
11. B
12. C
13. A
14. A
15. C
16. B
17. B
18. C
19. A
20. C
Student Assignment
7.1a Regression – Linear Correlation
Name:_____________________
You noticed that the average points scored in your high school football
conference games seem to correlate with the average nighttime
temperature for the football season over the last 10 years.
Year   Temp   Score
#1     67.5   12.6
#2     55.0    9.8
#3     58     10.1
#4     60     10.0
#5     59      9.8
#6     66.6   13.6
#7     60.4   11.3
#8     55.8   11.4
#9     59.6    9.8
#10    62.5   10.1
Is that a positive correlation or an inverse (negative) correlation? Do you think it is
a strong correlation? Why?
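One way to check an answer to the question above is to compute the correlation coefficient directly. The sketch below is only an illustration, not part of the original assignment: the helper name `pearson_r` is an assumption, and the two lists are copied from the assignment table. It computes the same quantity that the Excel `=CORREL` formula from the practice test returns.

```python
import math

# Data copied from the assignment table (10 seasons).
temps = [67.5, 55.0, 58, 60, 59, 66.6, 60.4, 55.8, 59.6, 62.5]
scores = [12.6, 9.8, 10.1, 10.0, 9.8, 13.6, 11.3, 11.4, 9.8, 10.1]


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products and squares of the deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


r = pearson_r(temps, scores)
print(f"r = {r:.2f}")
```

A value of r above 0 points to a positive correlation and a value below 0 to an inverse one; the closer r is to +1 or -1, the stronger the linear relationship.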
Student Assignment
7.2a Regression – Linear Regression
Name:_____________________
You are examining the safety record at a plant for an insurance
company. You have looked at the accidents per ten thousand
hours figure and are trying to identify a correlation with the
overtime worked per ten thousand hours.
#      OtHr   Accidents
#1     1000   2.5
#2      900   2.6
#3      800   1.9
#4      700   1.95
#5      600   1.85
#6      500   1.5
#7      400   1.4
#8      300   1.5
#9      200   1.2
#10     100   0.9
They have a new government contract and expect 1500 hours of
overtime for several periods during the year. Will there be a
significantly increased risk of accidents during those
periods?
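A least-squares line fitted to the table gives one way to approach this question. The following is a minimal sketch rather than the expected solution: the helper name `fit_line` is an assumption, and the data lists are copied from the assignment table. Note that 1500 overtime hours lies outside the observed range of 100 to 1000, so the prediction is an extrapolation, and the sketch does not run any formal significance test.

```python
# Data copied from the assignment table (10 periods).
ot_hours = [1000, 900, 800, 700, 600, 500, 400, 300, 200, 100]
accidents = [2.5, 2.6, 1.9, 1.95, 1.85, 1.5, 1.4, 1.5, 1.2, 0.9]


def fit_line(xs, ys):
    """Return (intercept, slope) of the least-squares line y = a + b * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    # The least-squares line always passes through (mean_x, mean_y).
    intercept = mean_y - slope * mean_x
    return intercept, slope


a, b = fit_line(ot_hours, accidents)
predicted = a + b * 1500  # extrapolation beyond the observed 100-1000 range
print(f"accidents = {a:.3f} + {b:.5f} * overtime")
print(f"predicted accidents at 1500 overtime hours: {predicted:.2f}")
```

Comparing the predicted figure with the largest values in the table (2.5 and 2.6) shows how far the contract periods would sit above anything the plant has recorded so far.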
Explore Activity:
Height vs. Arm span – Many body measures are related to each other, including your
height and arm span. For many people, these two measures are nearly the same. To explore
this relationship you will need the following:
• tape measure (inches or centimeters)
• computer with statistics software or a spreadsheet (or a graphing calculator with
stat functions)
• several people you can measure (like your classmates!)
Measure your height and arm span (you will probably need to work with a partner) and
record these two measures in a central location to share with your class. Once
everyone in your class has done this you will have a set of data (in pairs) that can be
plotted on a scatterplot (you can do this with a computer or even with graph paper and
pencil). Plot the points (arm span, height) so arm span will be on the horizontal (x) axis
and height on the vertical (y) axis. Note that you do not have to start the two axes at
(0, 0). Rather start each at a convenient value slightly below the smallest arm span
and smallest height.
1. How would you describe the relationship between arm span and height?
2. Are there any unusual cases? Describe them.
A least-squares linear regression equation can be produced to represent the
relationship between arm span and height. Your software (or graphing calculator) may
have a built-in command to produce this equation:
$$\text{height} = a + b \cdot (\text{arm span})$$
where a = the y-intercept (height intercept) and b = the slope of the line (increase in
height for 1 unit increase in arm span). You can also calculate this equation from the
means and standard deviations of height and arm span:
$$b = r \cdot \frac{s_y}{s_x}$$
$$a = \bar{y} - b\,\bar{x}$$
where r = correlation, $s_y$ = standard deviation of height, $s_x$ = standard deviation
of arm span, $\bar{y}$ = mean height, and $\bar{x}$ = mean arm span. All of these can be
calculated using formulas from earlier lessons.
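The slope formula above agrees with the usual least-squares slope. A short check, assuming both standard deviations use the same divisor (both n - 1 or both n); the shorthand $S_{xy}$, $S_{xx}$, $S_{yy}$ is introduced here only for this derivation, with x = arm span and y = height:

```latex
\begin{aligned}
S_{xy} &= \sum_i (x_i - \bar{x})(y_i - \bar{y}), \qquad
S_{xx} = \sum_i (x_i - \bar{x})^2, \qquad
S_{yy} = \sum_i (y_i - \bar{y})^2 \\
r &= \frac{S_{xy}}{\sqrt{S_{xx}\,S_{yy}}}, \qquad
\frac{s_y}{s_x} = \sqrt{\frac{S_{yy}}{S_{xx}}} \\
b &= r \cdot \frac{s_y}{s_x}
   = \frac{S_{xy}}{\sqrt{S_{xx}\,S_{yy}}} \cdot \sqrt{\frac{S_{yy}}{S_{xx}}}
   = \frac{S_{xy}}{S_{xx}} \\
a &= \bar{y} - b\,\bar{x}
\end{aligned}
```

The last line also says that the fitted line passes through the point of means $(\bar{x}, \bar{y})$.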
Graph the regression equation on your scatterplot. Notice it goes “through the middle”
of your plotted points. You could now use the equation to make predictions of heights if
you know a person’s arm span.
1. Use your regression equation to calculate a person’s height if you know the arm
span is 64 inches.
2. Use your regression equation to calculate a person’s height if you know the arm
span is 72 inches.
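The two predictions above can be sketched in code once class data is available. This is an illustration under stated assumptions: the measurements below are hypothetical placeholder values (not real class data), and the function name `regression_from_summary` is made up for this example. It builds the line from the correlation, means, and standard deviations, as the activity describes.

```python
import statistics

# Hypothetical (made-up) measurements in inches; replace with your class data.
arm_spans = [60.0, 62.5, 64.0, 66.0, 67.5, 69.0, 71.0, 73.5]
heights = [61.0, 62.0, 64.5, 65.5, 68.0, 68.5, 71.5, 73.0]


def regression_from_summary(xs, ys):
    """Return (a, b, r) for y = a + b * x using means and standard deviations."""
    n = len(xs)
    mean_x = statistics.mean(xs)
    mean_y = statistics.mean(ys)
    s_x = statistics.stdev(xs)  # sample standard deviation (n - 1 divisor)
    s_y = statistics.stdev(ys)
    # Sample covariance with the same n - 1 divisor as the standard deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)
    r = cov / (s_x * s_y)
    b = r * s_y / s_x        # slope
    a = mean_y - b * mean_x  # y-intercept
    return a, b, r


a, b, r = regression_from_summary(arm_spans, heights)
for span in (64, 72):
    print(f"arm span {span} in -> predicted height {a + b * span:.1f} in")
```

With real class data the numbers will differ; the structure of the calculation stays the same.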
Other similar relationships could be explored:
• Height vs. femur (thigh bone) length
• Height vs. forearm (elbow to wrist)
• Height vs. vertical length of skull
Video Links – Check out the relevant links at Khan Academy for more information and
examples:
http://www.khanacademy.org/#statistics