mean, standard deviation and percent error

SSAC2005:QD451.WT1.1
Calibrating a Pipettor
• How accurate is your pipetting?
• What are the sources of error?
• What does it mean to calibrate a piece of equipment, and how do you do it?
Core Quantitative Issue
Variability: Precision vs. accuracy
Supporting Quantitative Concepts
Data Analysis: Mean, standard deviation
Visualizing data: bar and scatter graphs
Relative error; percent error
Size: Mass vs. volume
Prepared for SSAC by
Bill Thomas
Colby-Sawyer College, New London NH
© The Washington Center for Improving the Quality of Undergraduate Education. All rights reserved. 2005
1
Overview of Module
Every measurement, no matter
how carefully done, has some
error associated with it. Is the
error great or small? Knowing the
answer is important, for it tells us
how much confidence we can
have in the values obtained, and
that knowledge shapes decisions
that we make.
Every measurement can be done
more carefully, but to do so, we
have to know the source of the
error. Is it due more to technique
or to limitations of the equipment?
The answer helps us to improve
our results. It would be ineffective
to concentrate upon technique if
the problem were really a faulty
tool!
Because error is so common in
measurement, we have developed general
approaches, first to minimize error, and
then to assess how much remains.
Slide 3 introduces the concept of equipment
calibration.
Slide 4 explains mass-to-volume conversion.
Slide 5 discusses accuracy and reproducibility;
standard deviation; and percent error. All are
tools that help us measure the quality of what
we think we know.
Slides 6-8 spell out the problem and the
approach.
Slides 9-12 develop spreadsheets to treat
model data.
Slides 13-15 consider accuracy and
reproducibility.
Slide 16 asks you to think about your work in a
new context.
2
Equipment Calibration
How do you know that your equipment is functioning correctly?
Typically we use the equipment in question to measure something whose attributes
(size, volume, etc.) we can determine independently with considerable accuracy, and
then we adjust our equipment to yield the value that we “know” to be correct.
This process is known as “calibrating” the equipment.
In this case, you are calibrating a
micropipettor by determining the mean
(average) mass of multiple aliquots (volume
samples) using an analytical balance, then
converting that mass to a volume using the
density (ρ) of water, which is a known value.
You will compare the calculated volume with
the delivery volume preset on the pipettor to
determine the accuracy of the pipettor.
Mean = (Sum of sample values) / (Number of samples)
3
Mass to Volume Conversion
Density (ρ):
Density is simply mass per unit volume (g/cm³ or g/mL) as expressed in
the following formula:
ρ = m/V
You use the formula for density to determine volume delivered
from mass measured.

Density of Water at 1 atmosphere
Temperature (ºC)    Density (g/mL)
15                  0.999103
16                  0.998946
17                  0.998778
18                  0.998599
19                  0.998408
20                  0.998207
21                  0.997996
22                  0.997774
23                  0.997542
24                  0.997300
25                  0.997048
26                  0.996787
27                  0.996516
28                  0.996237
(from: Handbook of Chemistry and Physics, CRC Press, 64th Ed.)
4
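The conversion from mass to volume can be sketched in a few lines of Python. This is only an illustration, assuming the density values from the table above are copied into a lookup table; the function name `volume_ul` and the 6.000 mg sample mass are hypothetical and not part of the module.

```python
# Density of water (g/mL) at 1 atmosphere, copied from the table above.
WATER_DENSITY_G_PER_ML = {
    15: 0.999103, 16: 0.998946, 17: 0.998778, 18: 0.998599,
    19: 0.998408, 20: 0.998207, 21: 0.997996, 22: 0.997774,
    23: 0.997542, 24: 0.997300, 25: 0.997048, 26: 0.996787,
    27: 0.996516, 28: 0.996237,
}


def volume_ul(mass_mg, temperature_c):
    """Convert a mass of water in mg to a volume in microliters (V = m / rho)."""
    # A density in g/mL is numerically the same as a density in mg/uL.
    density = WATER_DENSITY_G_PER_ML[temperature_c]
    return mass_mg / density


# Hypothetical example: a 6.000 mg aliquot weighed at 22 degrees C.
example_volume = volume_ul(6.000, 22)
print(f"{example_volume:.4f} uL")
```

Because the density of water is slightly below 1 g/mL over this temperature range, the calculated volume in µL is always slightly larger than the mass in mg.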
Accuracy vs. Reproducibility
Accuracy measures how close a measured value comes to a
predetermined target value (the set volume on your pipettor).
Reproducibility (precision) measures how close repeated
values are to one another. These concepts can be visualized using
these cartoon (idealized) bulls-eye diagrams. Notice that accuracy
and precision can vary independently, so they can be
evaluated independently, as well.
[Figure: four bulls-eye diagrams labeled "accurate, precise"; "accurate, not precise";
"not accurate, precise"; "not accurate, not precise"]
5
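To see that accuracy and precision can be evaluated independently, the sketch below computes one number for each. All measurement values here are hypothetical, invented only to mimic the four bulls-eyes; the target of 6.0 and the labels are assumptions for illustration.

```python
import statistics

TARGET = 6.0  # hypothetical set value

# Hypothetical repeated measurements, invented for illustration only.
examples = {
    "accurate, precise": [6.00, 6.01, 5.99, 6.00],
    "accurate, not precise": [5.7, 6.3, 5.8, 6.2],
    "not accurate, precise": [6.50, 6.51, 6.49, 6.50],
    "not accurate, not precise": [6.2, 6.8, 6.3, 6.7],
}

for label, values in examples.items():
    # Accuracy: how far the mean lies from the target value.
    offset = statistics.mean(values) - TARGET
    # Precision: how widely the values scatter about their own mean.
    spread = statistics.stdev(values)
    print(f"{label:26s} offset = {offset:+.3f}  spread = {spread:.3f}")
```

The offset depends only on where the values are centered and the spread only on how tightly they cluster, which is why one can be good while the other is poor.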
PROBLEM
You have just been appointed Quality Control Officer in a company responsible for
critical medical assays. The company has recently been having difficulty with
consistency in its medical tests, and it is your task to get to the bottom of the
problem as soon as possible. People’s lives hang in the balance, and mistakes
could have serious legal repercussions for the company.
Pipettors are a mainstay of the laboratory work, and as they have not been
calibrated in a while, you suspect that they might be a source of the variability. On
the other hand, there are some new employees in the company, and their
pipetting technique might be part of the problem.
You decide to have these new employees carry out a standard calibration of the
lab pipettes. Based on the outcome of these tests, you should be able to
determine the source of the error.
•How can you evaluate the error in the pipetting?
•Is it due to poor technique, a faulty pipettor, or some of both?
In the next few slides you will see how an Excel spreadsheet and graphs
can help you to answer these questions. You will be asked to make the
determination in Question 6 of the end of module assignments.
6
Strategy
1. Choose a pipettor and arbitrarily select a volume that lies in the middle of the
pipette’s range to deliver (Why in the middle?). Be sure that you have the
appropriate tip for the pipette.
2. After taring the analytical balance with a plastic or aluminum “boat” on the
pan of the balance, deliver in quick succession 10 aliquots of water to the
boat, recording the increasing cumulative weight of water on the balance
after each addition. Remember to record the ambient temperature for the
density calculations.
3. Use a spreadsheet to evaluate the accuracy and reproducibility of the
pipetting. In particular, what are the mean, standard deviation and
percent error of the data sets?
Definitions of the key terms (taring, aliquot, cumulative, ambient temperature,
mean, standard deviation and percent error) are given on the Definitions slide.
7
Strategy (cont.)
1. Before treating your own data, you will work with four sets of sample data to
help you set up the Excel spreadsheet and “learn the ropes”. The first of
these four sets of data will be in the model spreadsheet (Cells C4 – C13) on
Slide 10.
2. The model spreadsheet provides all the values that you need. Your job is to
create the equations that produce the given values in each cell. Once you
have these equations in place, you can easily treat any new data set or any
new conditions by simple substitution in the appropriate cells (as for the
remaining three data sets).
Note: Cells are color-coded according to function:
Yellow = value is given, or chosen.
Orange = calculate; use cell equation.
In any cell you can create almost any equation that you need:
= C3+C4
OR………..
8
Strategy (cont.)
Excel can provide you
with equations (functions)
that you can use.
Using equations of your own construction or those provided by
Excel, follow the steps in the following slide to calculate the
mean and standard deviation for the data set given.
9
Creating a Spreadsheet
Recreate this spreadsheet in an Excel file of your own. Insert
the equations necessary to calculate the values in the orange
boxes using the given values in the yellow boxes.

Row   B            C              D
2     Set Volume   6 µL
3     Sample       Cumulative     Aliquot
      Number       Weight (mg)    weight (mg)
4     1            6.001          6.001
5     2            12.003         6.002
6     3            18.001         5.998
7     4            23.998         5.997
8     5            29.999         6.001
9     6            35.998         5.999
10    7            41.996         5.998
11    8            47.996         6.000
12    9            53.997         6.001
13    10           59.996         5.999
14
15                 Sum=           59.996
16                 N=             10
17                 Mean=          5.9996
18
19
20    Mean=        5.9996
21

Cell C2 contains the preset volume for the pipette.
Block C4:C13 contains the measured weights.
Block D4:D13 contains equations to calculate the weight of each
aliquot. Hint: use the copy and paste commands where you can.
Block D15:D17 contains equations to calculate the
mean by summing the aliquot weights and dividing
by the number of them. Hint: for N, you can use
Excel’s COUNT function.
Cell C20 calculates the mean using
Excel’s built-in AVERAGE function.
10
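For readers who want to check their cell equations outside Excel, the same steps can be sketched in Python. This is an illustrative parallel to the spreadsheet and not part of the module; the variable names are assumptions, and the data are the cumulative weights from cells C4:C13.

```python
# Cumulative balance readings (mg) from the model spreadsheet, cells C4:C13.
cumulative_mg = [6.001, 12.003, 18.001, 23.998, 29.999,
                 35.998, 41.996, 47.996, 53.997, 59.996]

# Column D: each aliquot is the current reading minus the previous one.
# The first aliquot is measured from the tared balance, i.e. from zero.
# Rounding to 3 decimals matches the 0.001 mg resolution of the readings.
aliquots_mg = [round(current - previous, 3)
               for previous, current in zip([0.0] + cumulative_mg, cumulative_mg)]

total = sum(aliquots_mg)   # plays the role of D15, "Sum="
n = len(aliquots_mg)       # plays the role of D16, "N=" (Excel's COUNT)
mean = total / n           # plays the role of D17, "Mean="

print(aliquots_mg)
print(f"Sum = {total:.3f}, N = {n}, Mean = {mean:.4f}")
```

Subtracting each reading from the next is the same pattern that one copied-and-pasted equation reproduces down Block D4:D13.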
Creating a Spreadsheet (cont…)
Complete the spreadsheet. Add Columns E and F.

Row   B            C              D              E                 F
2     Set Volume   6 µL
3     Sample       Cumulative     Aliquot        Difference (Δ)    Δ²
      Number       Weight (mg)    weight (mg)    from mean
4     1            6.001          6.001          0.0014            1.96E-06
5     2            12.003         6.002          0.0024            5.76E-06
6     3            18.001         5.998          -0.0016           2.56E-06
7     4            23.998         5.997          -0.0026           6.76E-06
8     5            29.999         6.001          0.0014            1.96E-06
9     6            35.998         5.999          -0.0006           3.60E-07
10    7            41.996         5.998          -0.0016           2.56E-06
11    8            47.996         6.000          0.0004            1.60E-07
12    9            53.997         6.001          0.0014            1.96E-06
13    10           59.996         5.999          -0.0006           3.60E-07
14
15                 Sum=           59.996         Sum =             2.44E-05
16                 N=             10             N -1=             9
17                 Mean=          5.9996         variance=         2.71E-06
18                                               std dev=          0.001647
19
20    Mean=        5.9996                        Std Dev=          0.001647
21

Block E4:E13 contains equations that subtract the
mean from the respective aliquot weights.
Block F4:F13 contains equations that square those
differences.
Block F15:F18 calculates the variance and standard deviation
of the aliquot weights. The variance is the sum of the
squared differences divided by COUNT-1,
and the standard deviation is the
square root of the variance.
Cell F20 calculates the standard
deviation using Excel’s built-in
STDEV function.
11
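The step-by-step calculation in Columns E and F can also be sketched in Python and checked against a library function. This is an illustrative parallel, not part of the module; it assumes that Python's `statistics.stdev`, like Excel's STDEV, divides by N - 1, and the variable names are hypothetical.

```python
import math
import statistics

# Aliquot weights (mg) from Block D4:D13 of the model spreadsheet.
aliquots_mg = [6.001, 6.002, 5.998, 5.997, 6.001,
               5.999, 5.998, 6.000, 6.001, 5.999]

mean = sum(aliquots_mg) / len(aliquots_mg)
differences = [x - mean for x in aliquots_mg]         # Column E: difference from mean
squares = [d ** 2 for d in differences]               # Column F: squared differences
sum_of_squares = sum(squares)                         # F15, "Sum ="
variance = sum_of_squares / (len(aliquots_mg) - 1)    # F17, "variance="
std_dev = math.sqrt(variance)                         # F18, "std dev="

# Sample standard deviation from the standard library, for comparison (like F20).
builtin_std_dev = statistics.stdev(aliquots_mg)

print(f"by hand: {std_dev:.6f} mg, statistics.stdev: {builtin_std_dev:.6f} mg")
```

Doing the calculation both ways mirrors the slide's approach of checking your own equations against a built-in function.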
Using what you have learned
Build a spreadsheet that
calculates the mean and
standard deviations of these
three additional data sets
together with the data set of the
previous slides. Copy and Paste
Columns B, C, and D from your
spreadsheet of Slide 11, and
Copy and Paste Columns C and
D three times to make space for
the new data. Then insert the
new values in Columns E, G,
and I.
Sample Data
Data set 2      Data set 3      Data set 4
Cumulative      Cumulative      Cumulative
Wt (mg)         Wt (mg)         Wt (mg)
6.501           6.1             6.6
13.003          12.3            13.3
19.501          18.1            19.6
25.998          23.8            25.8
32.499          29.9            32.4
38.998          35.8            38.8
45.496          41.6            45.1
51.996          47.6            51.6
58.497          53.7            58.2
64.996          59.996          64.6

We will use the spreadsheet in the rest of the module to
compare the four calibration runs.
Remember:
Yellow = value is given, or chosen.
Orange = calculate; use cell equation.

Row  B           C           D           E           F           G           H           I           J
2    Set Volume  6 µL
3                Data set 1  Data set 1  Data set 2  Data set 2  Data set 3  Data set 3  Data set 4  Data set 4
4    Sample      Cumulative  Aliquot Wt  Cumulative  Aliquot Wt  Cumulative  Aliquot Wt  Cumulative  Aliquot Wt
     Number      Wt (mg)     (mg)        Wt (mg)     (mg)        Wt (mg)     (mg)        Wt (mg)     (mg)
5    1           6.001       6.001       6.501                   6.1                     6.6
6    2           12.003      6.002       13.003                  12.3                    13.3
7    3           18.001      5.998       19.501                  18.1                    19.6
8    4           23.998      5.997       25.998                  23.8                    25.8
9    5           29.999      6.001       32.499                  29.9                    32.4
10   6           35.998      5.999       38.998                  35.8                    38.8
11   7           41.996      5.998       45.496                  41.6                    45.1
12   8           47.996      6.000       51.996                  47.6                    51.6
13   9           53.997      6.001       58.497                  53.7                    58.2
14   10          59.996      5.999       64.996                  59.996                  64.996
15   Mean                    5.9996
16   Std Dev                 0.0016465
12
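Treating a new data set "by simple substitution" corresponds, in code, to wrapping the calculation in a function and calling it on each set of cumulative weights. The sketch below is illustrative only; the function name `summarize` is hypothetical, and it is applied here to Data sets 2 and 3 from the Sample Data.

```python
import statistics


def summarize(cumulative_mg):
    """Return (mean, sample standard deviation) of the aliquot weights in mg."""
    # Each aliquot is a cumulative reading minus the reading before it;
    # the first aliquot is measured from the tared balance (zero).
    previous_readings = [0.0] + cumulative_mg[:-1]
    aliquots = [current - previous
                for previous, current in zip(previous_readings, cumulative_mg)]
    return statistics.mean(aliquots), statistics.stdev(aliquots)


# Cumulative weights (mg) copied from the Sample Data.
data_set_2 = [6.501, 13.003, 19.501, 25.998, 32.499,
              38.998, 45.496, 51.996, 58.497, 64.996]
data_set_3 = [6.1, 12.3, 18.1, 23.8, 29.9,
              35.8, 41.6, 47.6, 53.7, 59.996]

for label, data in (("Data set 2", data_set_2), ("Data set 3", data_set_3)):
    mean, std_dev = summarize(data)
    print(f"{label}: mean = {mean:.4f} mg, std dev = {std_dev:.4f} mg")
```

Keeping the calculation in one place means every data set is treated identically, which is the same benefit the copied spreadsheet columns provide.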
Looking at your results
“But I have weights, not volumes. I still don’t know anything about the
accuracy of the pipettes!”
Right!
For these calculations, use a density of 1 g/mL for water.
For your lab data (later), you will use a density that will depend on
the temperature at which you did the work. So, be prepared to add a
row to your spreadsheet to convert the mass of the aliquot to volume.
Remember, you will be using a spreadsheet like this to calculate several
sets of values, so be sure you understand all of the steps.

Row   B           C       D
2     Results
3     Data set    Mean    Std Dev
4     1
5     2
6     3
7     4

• Organize the results of your calculations so you can easily
compare them. Compare the statistics to the data themselves (Slide 12).
• What do you notice? (You should notice something!)
13
Evaluating the results: Accuracy
Look at the four means from Slide 13 and determine which of the results are the
most accurate (closest to the preset aliquot volume). Which are least accurate?
How can you compare the accuracies?
1. Calculate the relative error for the sample data (use your spreadsheet).
(Relative error is the magnitude of the difference between the
measured and set values divided by the set value; it can be expressed as a percent.)
2. Use a spreadsheet to create a bar graph to visualize the difference in accuracy
amongst the four data sets.
To create a bar graph, highlight your spreadsheet and then enter Chart Wizard in the
menu bar of Excel to create your graph.
The data sets are so different that you will need two graphs with different vertical scales.
[Figure: two bar graphs of Relative Error (%) for Data Sets 1 to 4, one with a vertical
scale from 0 to 9 and one with a vertical scale from 0 to 0.1]
14
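The relative error calculation can be sketched as follows. This is an illustration, not part of the module: it assumes the density of 1 g/mL specified for the sample data (so 1 mg of water corresponds to 1 µL), uses the mean of Data set 1 from the model spreadsheet, and takes the mean of Data set 2 as its final cumulative weight (64.996 mg) divided by 10. The function and variable names are hypothetical.

```python
SET_VOLUME_UL = 6.0       # preset volume of the pipettor
DENSITY_MG_PER_UL = 1.0   # 1 g/mL, as specified for the sample data


def relative_error_percent(mean_mass_mg, set_volume_ul=SET_VOLUME_UL):
    """Relative error (%) = |measured - set| / set * 100."""
    measured_volume_ul = mean_mass_mg / DENSITY_MG_PER_UL
    return abs(measured_volume_ul - set_volume_ul) / set_volume_ul * 100


mean_masses_mg = {
    "Data set 1": 5.9996,   # mean from the model spreadsheet
    "Data set 2": 6.4996,   # 64.996 mg total divided by 10 aliquots
}

for label, mean_mass in mean_masses_mg.items():
    print(f"{label}: relative error = {relative_error_percent(mean_mass):.4f} %")
```

The two results differ by roughly three orders of magnitude, which is why a single bar graph cannot show both clearly on one vertical scale.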
Evaluating the results: Reproducibility
Look again at the four data sets (your spreadsheet from Slide 12). Which data seem to vary
most about the mean? Which vary the least? How can you visualize the variability?
Use your spreadsheet from Slide 12 to make an XY scatter plot like this one.
[Figure: XY scatter plot of Aliquot Mass (mg), 5.6 to 6.8, against Sample Number, 0 to 11,
showing the four data sets]
How do the two data sets marked by blue symbols differ from the two marked by red
symbols? What do the two data sets marked by the circles and squares have in
common? What do the two data sets marked by the triangles and diamonds have
in common? How do the four data sets match up with the bulls-eyes on Slide 5?
Standard deviation provides a more compressed way to visualize the variability.
Note, however, that standard deviation has limited validity for small data
sets (< 5 individual samples).
15
Evaluating the results: Reproducibility
Look at the four standard deviations that you have calculated and use them to determine
which of the four sets of results are the most precise (most clustered about the mean).
Which are least precise? How does your assessment here compare with that based on
the graph in the previous slide?
How can you visualize the standard deviations?
Use your spreadsheet from Slide 13 to draw an X,Y scatter plot showing the
means with error bars that correspond to the standard deviations. Create the
graph by first plotting the means against the number of the data set.
Then double click on a data point and specify the appropriate standard
deviation. See the slide “How to add error bars” for help on adding the error bars.
[Figure: X,Y scatter plot of mean Aliquot Mass (mg), 5.75 to 6.75, against Data Set, 0 to 5,
with error bars. The standard deviations for Data Sets 1 and 2 plot within the icon
locating the mean.]
16
End of Module Assignments
1. Sketch the bulls-eyes of Slide 5 and label them with the number of the data set. (Use
your spreadsheet of Slide 12 and consult the graph in Slide 15.)
2. What is the utility of a mean when dealing with data sets? Limitations?
3. What is the utility of the standard deviation? What does it show?
4. What is the utility of a spreadsheet for solving computational problems?
5. What is the utility of graphs in general? In particular, what is the utility of an XY
(scatter) plot as opposed to a bar graph?
6. In your capacity as Quality Control Officer (see Slide 6), how can you determine
whether the recent variability in your lab’s results is due more to poor pipetting
(operator problem) or faulty equipment (instrument problem)?
7. Now that you have your spreadsheet(s) fully formatted, use it (them) to carry out a
similar analysis of your own lab data.
8. A last reflection: Can you see situations in daily life where error/variability/uncertainty
can be considered in similar ways and where such a consideration might affect your
personal decisions? Justify your points with specific examples.
17
Definitions
• taring – this is a procedure used by scientists to factor out the weight of a
container (or “boat”) when weighing a substance. The scientist measures the
weight of the boat on a balance, then sets the balance equal to zero so that
any subsequent measurements will not include the weight of the boat.
• aliquot – an equal fractional part of a whole. For example, if there are 10
aliquots of a mass of water, then each aliquot will be 1/10 the mass of the
original mass.
• cumulative – an increase in some value resulting from successive additions
• ambient temperature – the temperature of the surrounding environment
• mean – the average value (see the equation below)
• standard deviation – roughly the average distance between the mean of a
set of numbers and the individual values in that set (see the equation below)
• percent error – the difference between the measured and the set values as
a fraction of the set value times 100 (see the equation below)
Mean:
$$\bar{x} = \frac{\text{sum of all values}}{\text{total number of values}}$$
Standard Deviation:
$$s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$$
where $x_i$ are the individual values, $\bar{x}$ is their mean, and $n$ is the number of values.
Percent Error:
$$\text{percent error} = \frac{(\text{measured} - \text{set}) \times 100}{\text{set}}$$
18
How to add error bars
• Once you create your x-y scatter plot, click once on
any of the data points. This should highlight them all.
• Under the “Format” menu, select “Selected data series…”
• Select the “Y Error Bars” tab
• Under the “Display” menu, highlight “Both”
• Under the “Error amount” menu, select “Custom”
(ignore any values in the boxes above it). This will
allow you to select your standard deviations as the
plus and minus amounts.
  • Click on the graph symbol at the right end of the
  “+” text bar. This will temporarily close the
  “Format Data Series” activity box, but leave open
  the text bar.
  • Highlight the four cells with the standard
  deviation (beginning with the first and dragging
  down to the 4th…it is important that you highlight
  them in order so Excel can apply them in order
  to your graph).
  • Click on the icon at the right end of the text bar
  (there should now be text within the bar). This
  will maximize the “Format Data Series” again.
  • Because your standard deviation values apply
  to both the plus and minus portions of the error bar,
  you will repeat this process using the “-” text bar.
• When you’re done, click “OK”.
19
Pre-Test
1. What does it mean to calibrate a piece of laboratory equipment?
2. Distinguish between accuracy and precision by describing the
difference between an inaccurate and an imprecise piece of laboratory
equipment.
3. What does a standard deviation measure?
4. This spreadsheet shows the cumulative
weights of ten successive samples of powder
added one by one to a weighing pan. The
orange cells are intended to show the weights
of the individual samples, the mean weight,
and the standard deviation of the sample
weights. To complete the spreadsheet, what
cell equations do you need to place in (a) Cell
D10, (b) Cell D14, and (c) Cell D15?

Row   B                     C                    D
2     Sample                Cumulative Wt (g)    Weight of sample (g)
3     1                     0.501
4     2                     1.003
5     3                     1.498
6     4                     1.998
7     5                     2.49
8     6                     2.988
9     7                     3.495
10    8                     3.996
11    9                     4.486
12    10                    4.991
13
14    Mean
15    Standard Deviation

5. Explain how you would Cut and Paste to
simplify putting the equations into Cells D3
through D12.
20