Linear regression and calibration curves Chemistry 321, Summer 2014

In the next few labs, you will be generating
calibration curves: generally, testing the linear
response of an instrument by measuring some
property of standards of known concentration.
You will record data that perhaps looks like this:
[A] (mM)    Absorbance
0           0
0.1         0.058
0.2         0.122
0.4         0.223
0.8         0.433
Linear regression answers the question:
Which line has the best “fit” to the data?
Linear Regression Analysis
• Regression analysis is used to predict the value of one
variable (the dependent variable) on the basis of other
variables (the independent variables).
• Dependent variable: denoted Y
• Independent variables: denoted X1, X2, …, Xk
• If we only have ONE independent variable, the model is
      Y = β0 + β1X + ε
  which is referred to as simple linear regression (ε is the
  random error term). We would be interested in estimating
  β0 and β1 from the data we collect. Note that β0 represents
  the y-intercept of the “best-fit” line and β1 is the slope
  of that line.
Least Squares Line
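The vertical differences between the data points and the
line are called residuals or errors. The least squares line
is the line that minimizes the sum of the squared residuals.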
Least Squares Line…[sure glad we have computers now]
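• The coefficients b1 and b0 for the least squares line…
      ŷ = b0 + b1x
• …are calculated as:
      b1 = Σ(xi − x̄)(yi − ȳ) / Σ(xi − x̄)²
      b0 = ȳ − b1x̄
  where x̄ and ȳ are the means of the x and y values.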
Least Squares Line… See if you can estimate the y-intercept and slope from these data
Data Points:
x    y
1    6
2    1
3    9
4    5
5    17
6    12
y = 0.934 + 2.114x
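As a check on that estimate, the slope and intercept can be computed directly from the standard least squares formulas: slope = Σ(xi − x̄)(yi − ȳ) / Σ(xi − x̄)² and intercept = ȳ − slope·x̄. The following is a minimal Python sketch; the variable names are illustrative and do not come from the slides.

```python
# Data points from the slide
xs = [1, 2, 3, 4, 5, 6]
ys = [6, 1, 9, 5, 17, 12]

n = len(xs)
x_mean = sum(xs) / n
y_mean = sum(ys) / n

# Sum of cross-deviations and sum of squared x-deviations
s_xy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
s_xx = sum((x - x_mean) ** 2 for x in xs)

slope = s_xy / s_xx                    # b1
intercept = y_mean - slope * x_mean    # b0

print(f"y = {intercept:.3f} + {slope:.3f}x")
```

The slope comes out at about 2.114 and the unrounded intercept at about 0.933; the 0.934 shown on the slide presumably results from using the rounded slope.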
Coefficient of Determination…
• Tests thus far have shown if a linear relationship exists;
it is also useful to measure the strength of the
relationship. This is done by calculating the coefficient
of determination, R².
• The coefficient of determination is the square of the
coefficient of correlation (r), hence R² = r².
• r will be computed shortly; R² = r² holds only for models
with a single independent variable.
Coefficient of Determination
• Unlike the value of a test statistic, the coefficient
of determination does not have a critical value
that enables us to draw conclusions.
• In general, the higher the value of R², the better
the model fits the data.
• R² = 1: Perfect match between the line and the
data points.
• R² = 0: There is no linear relationship between x
and y.
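To make the link between r and R² concrete, the sketch below computes both for the six data points from the earlier least squares slide. R² is calculated here as 1 − (sum of squared residuals) / (total sum of squares), which is one common definition; the variable names are illustrative assumptions.

```python
import math

# Six data points from the earlier least squares slide
xs = [1, 2, 3, 4, 5, 6]
ys = [6, 1, 9, 5, 17, 12]

n = len(xs)
x_mean = sum(xs) / n
y_mean = sum(ys) / n
s_xx = sum((x - x_mean) ** 2 for x in xs)
s_yy = sum((y - y_mean) ** 2 for y in ys)
s_xy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))

# Coefficient of correlation
r = s_xy / math.sqrt(s_xx * s_yy)

# Least squares line and the sum of its squared residuals
slope = s_xy / s_xx
intercept = y_mean - slope * x_mean
ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))

# R^2 as the fraction of the total variation explained by the line
r_squared = 1 - ss_res / s_yy

print(f"r = {r:.3f}, r^2 = {r * r:.3f}, R^2 = {r_squared:.3f}")
```

For these scattered points, R² is only about 0.49, and it agrees with r² as expected for a single independent variable.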
When can you use linear regression?
• Linear regression requires us to satisfy three assumptions
about the distributions of the two quantitative variables:
• No outliers
• An expected linear relationship between the variables
• Equal variance of the residuals across predicted values
• (weaker) At least ten data points
• The evaluation of the conformity of the analysis to these
assumptions is generally based upon visual analysis of the
scatter plot of the dependent variable by the independent
variable. (Big hint: On Excel, use “scatter”, not “line” graphs).
7/12/2016
Slide 10
[Figure: Calibration curve for the spectrophotometer. Scatter plot of Absorbance (y-axis, 0 to 0.5) versus [A] in mM (x-axis, 0 to 1).]
So go ahead and plot your points and label your axes as usual.
(These are the data from the second slide).
[Figure: Calibration curve for the spectrophotometer, Absorbance versus [A] (mM), with the best-fit line added: y = 0.5375x + 0.006, R² = 0.99893.]
Using whatever software you like (don't do it by hand; I used Excel
for this), display both the equation of the best-fit line and the
coefficient of determination (R²).
[Figure: the same calibration curve for the spectrophotometer with its best-fit line, y = 0.5375x + 0.006, R² = 0.99893.]
So you can use either the equation or the graph itself to determine
the concentration of your sample from its absorbance.
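Using the equation means solving y = 0.5375x + 0.006 for x, that is, x = (y − 0.006) / 0.5375. A short sketch follows; the function name and the sample absorbance of 0.300 are hypothetical, chosen only for illustration.

```python
# Fit coefficients as displayed on the calibration plot
slope = 0.5375       # absorbance per mM
intercept = 0.006    # absorbance at [A] = 0


def concentration_from_absorbance(absorbance):
    """Invert y = slope*x + intercept to get [A] in mM."""
    return (absorbance - intercept) / slope


# Hypothetical sample reading, not a value from the slides
sample_absorbance = 0.300
sample_conc = concentration_from_absorbance(sample_absorbance)
print(f"[A] = {sample_conc:.3f} mM")
```

A reading of 0.300 corresponds to roughly 0.547 mM. The result is only supported for absorbances inside the range covered by the standards (here, 0 to about 0.43).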
Summary
• Linear regression provides additional statistical information
about the relationship between two quantitative variables.
• The coefficient of determination, R², indicates the
fraction of variance in the dependent variable that is
accounted for by variability in the independent variable
• The regression equation is the formula for the trend or fit line
which enables us to predict the dependent variable for any
given value of the independent variable
• The regression equation has two parts – the intercept and the
slope