ERRORS AND TREATMENT OF DATA
Error is a measure of the inaccuracy of a measurement.
Errors can be divided into two classes:
1. Determinate errors (constant errors and systematic errors)
2. Indeterminate errors (random errors)
1. DETERMINATE ERRORS
Determinate errors have a definite, determinable value and can either be avoided or corrected.
A determinate error may have the same value under different conditions and may remain constant from one measurement to another. Such determinate errors are called constant errors, e.g. the presence of an impurity in the substance used for the standardization of a solution.
It is also possible for a determinate error to vary in magnitude and even in sign from one measurement to another; such determinate errors are called systematic errors.
These are errors which affect the measurement in a regular and predictable way. Errors in the calibration of a scale, in the value of a standard mass, or in the expansion and contraction of a volumetric solution with a change in temperature are common examples of systematic errors.
Systematic errors usually introduce a definite bias one way or the other, i.e. they will tend to give either a positive or a negative error.
Systematic errors can always, in principle, be corrected for and so eliminated. In practice there are two difficulties: firstly, some systematic errors may remain undetected, and secondly, it may be impracticable to calculate the corrections.
To detect unknown systematic errors it is necessary to make one or more independent checks by using completely different methods of measuring the same quantity. Where independent methods give differing results, a systematic error in at least one of the methods is indicated, and possible sources then have to be looked for.
Types of Determinate Errors (or Sources of Determinate Errors)
The determinate errors that must be taken into consideration during analysis are numerous. Four general classes may be distinguished:
(a) Instrumental errors and those due to apparatus and reagents
(i) Balance and weights: e.g. insufficient sensitivity, uncalibrated weights.
(ii) Volumetric apparatus: use of uncalibrated glassware.
(iii) Vessels and utensils: e.g. introduction of foreign materials through attack of the glassware.
(iv) Reagents: presence of impurities, either the same as the substance sought or interfering substances.
(b) Operative Errors
These are errors associated with the analyst himself. They are independent, to a great extent, of the instruments and utensils employed, and they are not related to the chemical properties of the system being studied; their magnitude depends more upon the analyst himself.
Operative errors may be very high in magnitude if the analyst is inexperienced, careless or thoughtless. These errors are reduced to insignificant levels by careful, skilful and understanding work.
Examples of such errors include:
1. Leaving vessels uncovered and thus introducing dust and other foreign matter into a solution.
2. Spilling of liquids during purification of standard solutions.
3. Loss during transfer of the filtrate.
4. Failure to apply a temperature correction in volumetric analysis.
5. Errors in calculations.
6. Use of a non-representative sample.
It should however be noted that the errors inherent in certain
manipulations can never be entirely eliminated, but they can be reduced
to such a small magnitude by the proper procedure that they will hardly
come into consideration.
(c) Personal Errors
(i) These are personal errors of the analyst. They originate in the constitutional inability of an individual to make certain observations accurately. An example is the inability to judge colour changes correctly during a titration. This could be more serious if the analyst is colour-blind; as a result he might be constantly overshooting the end point.
(ii) Prejudice: When it is, for example, a question of what tenth of a division is to be taken in reading a scale, the operator is likely to choose the one that will make the result agree more closely with the preceding one. Or, if there is some doubt as to the exact location of the end point, he is most likely to stop the titration at the point which will give a result in agreement with the previous titration, if he knows what the burette reading should be.
(d) Errors of Method
These are errors that originate from the chemical or physicochemical properties of the analytical system. They are the most serious errors encountered in chemical analysis, because they are inherent in the method: no matter how skilfully and carefully the analyst works, their magnitude remains the same unless the conditions of the determination are altered.
Some sources of errors of method are as follows:
(i) In gravimetric analysis, solubility of a precipitate in the wash liquid.
(ii) Failure of a reaction to go to quantitative completion.
(iii) Decomposition of a precipitate during drying.
(iv) Co-precipitation and post-precipitation.
This type of error can be eliminated by the adoption of a better method.
2. INDETERMINATE ERRORS
The second class of errors comprises the indeterminate errors, often called accidental or random errors. They are revealed by small differences in successive measurements made by the same analyst under virtually identical conditions, and they cannot be predicted or estimated.
These accidental errors will follow a random distribution; therefore,
mathematical laws of probability can be applied to arrive at some
conclusion regarding the most probable result of a series of
measurements.
It is beyond the scope of this text to go into mathematical probability, but we can say that indeterminate errors should follow a normal distribution, or Gaussian curve. It is apparent that there should be few very large errors and that there should be an equal number of positive and negative errors, as shown in Figure 1.
Indeterminate errors really originate in the limited ability of the analyst to control or make corrections for external conditions, or in his inability to recognize the appearance of factors that will result in errors. Some random errors stem from the inherently statistical nature of things, for example, nuclear counting errors. Sometimes, by changing conditions, some unknown errors will disappear. Of course, it will be impossible to eliminate all possible random errors in an experiment, and the analyst must be content to minimize them to a tolerable or insignificant level.
Limits of Error
e.g. (25.01 ± 0.02) cm²
±0.02 is the limit of error in the above result. The limits of error in general assess the magnitude of the random error which may be present. Limits of error also provide a measure of the precision of the measurement, i.e. its freedom from random error.
Precision: the degree of agreement between replicate measurements of the same quantity, i.e. the repeatability of a result.
Accuracy: the degree of agreement between the measured value and the accepted true value, i.e. closeness to the true value. Thus accuracy implies freedom from systematic error.
Good precision does not assure good accuracy. If there were a systematic error in the analysis (for example, a weight used to measure each of the samples may be in error), it does not affect the precision, but it does affect the accuracy. Nevertheless, the higher the degree of precision, the greater the chance of obtaining the true value.
WAYS OF EXPRESSING ACCURACY
There are various ways and units in which the accuracy of a
measurement can be expressed, an accepted true value for comparison
being assumed.
Absolute Error
The difference between the measured value and the true value, with regard to sign, is the absolute error, and it is reported in the same units as the measurement. If a 2.62-g sample of material is analyzed as 2.52 g, the absolute error is −0.10 g. If the measured value is the average of several measurements, the error is called the mean error. The mean error can also be calculated by taking the average difference, with regard to sign, of the individual test results from the true value.
Relative Error
The absolute or mean error expressed as a percentage of the true value is the relative error. The above analysis has a relative error of (−0.10/2.62) × 100% = −3.8%. The relative accuracy is the measured value or mean expressed as a percentage of the true value. The above analysis has a relative accuracy of (2.52/2.62) × 100% = 96%. We should emphasize that neither number is known to be "true", and the relative error or accuracy is based on the mean of two sets of measurements.
The relative error can be expressed in units other than percentages. In very accurate work, we are usually dealing with relative errors of less than 1%, and it is convenient to use a smaller unit. A 1% error is equivalent to 1 part in 100. It is also equivalent to 10 parts in 1000. This latter unit is commonly used for expressing small uncertainties; that is, the uncertainty is expressed in parts per thousand, written as ppt. The number 23 expressed as parts per thousand of the number 6725 would be 23 parts per 6725, or 3.4 ppt. Parts per thousand is often used in expressing the precision of measurements.
Worked Example 1
The result of an analysis is 36.97%, compared with the accepted value of 37.06%. What is the relative error in parts per thousand?
Solution:
Absolute error = 36.97% − 37.06% = −0.09%
Relative error = (−0.09 / 37.06) × 1000 = −2.4 ppt
(‰ indicates parts per thousand, just as % indicates parts per hundred.)
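
The following is a minimal Python sketch of the absolute-error and parts-per-thousand calculations above; the function names are illustrative, not from the text.

```python
def absolute_error(measured, true):
    """Absolute error, with regard to sign, in the same units as the measurement."""
    return measured - true

def relative_error_ppt(measured, true):
    """Relative error expressed in parts per thousand (ppt)."""
    return (measured - true) / true * 1000

# Worked Example 1: result 36.97% vs. accepted value 37.06%
print(round(absolute_error(36.97, 37.06), 2))       # -0.09
print(round(relative_error_ppt(36.97, 37.06), 1))   # -2.4 ppt
```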
WAYS OF EXPRESSING PRECISION (OR ESTIMATION OF RANDOM ERRORS)
1. The Mean (x̄)
The mean, arithmetic mean and average are more or less the same. It is obtained by dividing the sum (Σx) of a set of replicate measurements by the number (N) of readings:
x̄ = Σx / N ………….. (1)
2. Mean Deviation (or Average Deviation), d̄
The mean deviation of a set of measurements is the mean of the differences between the individual measurements (x) and the mean (x̄) of the measurements, without regard to sign:
d̄ = Σ|x − x̄| / N ………….. (2)
3. Standard Deviation (σ) (for a very large set of data, N > 30)
σ = √[Σ(x − x̄)² / N] ………….. (3)
where x = measured value, x̄ = mean, N = number of readings.
Standard Deviation (s) (for N < 30)
s = √[Σ(x − x̄)² / (N − 1)] ………….. (4)
4. Standard Error (Standard Deviation of the Mean), Sm
Sm = s / √N ……….….. (5)
N.B: The standard deviation is a better measure of precision than the average deviation, especially for a small number of measurements.
5. Variance: this is defined as the square of the standard deviation.
Variance = s² ……………. (6)
6. Average Deviation of the Mean (d̄mean)
d̄mean = d̄ / √N ……………. (7)
Worked Example 2
Calculate the average deviation and the relative average deviation of the following set of analytical results: 15.67 g, 15.69 g, 16.03 g.
Solution
x̄ = (15.67 + 15.69 + 16.03) / 3 = 47.39 / 3 = 15.80 g

x (g)      |x − x̄|
15.67      |15.67 − 15.80| = 0.13
15.69      |15.69 − 15.80| = 0.11
16.03      |16.03 − 15.80| = 0.23
           Σ|x − x̄| = 0.47

1. Average deviation (absolute average deviation), d̄
d̄ = Σ|x − x̄| / N = 0.47 / 3 = 0.16 g
2. Relative average deviation
d̄r = (Average deviation / Mean) × 100% = (0.16 / 15.80) × 100% = 1.0%
or
d̄r = (Average deviation / Mean) × 1000 ppt = (0.16 / 15.80) × 1000 ppt = 10 ppt
Worked Example 3
Given the following set of weights, 29.8 mg, 30.2 mg, 28.6 mg and 29.7 mg, calculate:
(a) the average deviation and the standard deviation of the individual values;
(b) the average deviation of the mean and the standard deviation of the mean.
Solution:

x (mg)      |x − x̄|      (x − x̄)²
29.8        0.2          0.04
30.2        0.6          0.36
28.6        1.0          1.00
29.7        0.1          0.01
Σ 118.3     Σ 1.9        Σ 1.41

x̄ = 118.3 / 4 = 29.6 mg

(a) Average deviation, d̄ = Σ|x − x̄| / N = 1.9 / 4 = 0.48 mg (absolute),
or, relative, (0.48 / 29.6) × 100% = 1.6%, or (0.48 / 29.6) × 1000 ppt = 16 ppt.
Standard deviation, s = √[Σ(x − x̄)² / (N − 1)] = √(1.41 / 3) = 0.69 mg.

(b) Average deviation of the mean, d̄mean = d̄ / √N = 0.48 / √4 = 0.24 mg.
Standard deviation of the mean (standard error), Sm = s / √N = 0.69 / √4 = 0.34 mg.
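
A short Python sketch (standard library only; variable names are ours) that reproduces these statistics. The hand calculation above rounds the mean to 29.6 mg before taking deviations, so the last decimal of the computed values differs slightly.

```python
import math

weights = [29.8, 30.2, 28.6, 29.7]   # mg
n = len(weights)

mean = sum(weights) / n                                                 # ~29.58 mg (29.6 in the text)
avg_dev = sum(abs(x - mean) for x in weights) / n                       # ~0.49 mg
std_dev = math.sqrt(sum((x - mean) ** 2 for x in weights) / (n - 1))    # ~0.68 mg
avg_dev_of_mean = avg_dev / math.sqrt(n)                                # ~0.24 mg
std_error = std_dev / math.sqrt(n)                                      # ~0.34 mg

print(mean, avg_dev, std_dev, avg_dev_of_mean, std_error)
```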
7. When the relative standard deviation is expressed as a percentage, we have the coefficient of variation (C.V.):
C.V. = (s / x̄) × 100% …………. (8)
8. The Normal Distribution
If s has been determined for a sample consisting of a great many readings, it gives a measure of how far individual readings are likely to be from the true value. If, in a series of repeated readings, one seems very different from the rest, the whole series should be repeated: it is bad practice simply to reject an outlying value, since it is characteristic of random errors that large errors do occur occasionally.
If the frequency curve of random errors is plotted, it is assumed
that their distribution can be represented by the normal frequency curve.
Figure 1: The normal (Gaussian) frequency curve (frequency plotted against x)
When we have a very large number of readings, one can say that 68% of the readings will be within ±s of the true value, 95% within ±2s and 99.7% within ±3s.
It must, however, be noted that how far individual readings are from the true value is not our main concern, because we normally take several readings and find the mean. So in actual fact one is more interested in how far the mean of the readings is from the true value. It must be emphasized that there is no way of telling what the true value is; it is, however, desirable to be able to assign a probability to the mean lying within a certain range of the true value.
This range will depend not only on the spread of the individual readings but also on N, the number of such readings. This range is in fact specified by a quantity called the standard deviation of the mean, Sm.
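
A quick simulation, not from the text, illustrating the 68%/95%/99.7% coverage of the Gaussian curve; it assumes only the Python standard library.

```python
import random

random.seed(1)
true_value, s = 100.0, 2.0
readings = [random.gauss(true_value, s) for _ in range(100_000)]

for k in (1, 2, 3):
    inside = sum(abs(x - true_value) <= k * s for x in readings)
    print(f"within +/-{k}s: {inside / len(readings):.1%}")   # roughly 68%, 95%, 99.7%
```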
RULES OF COMBINATION OF ERRORS
So far we have considered errors as a single quantity. Most often,
in an experiment, one is estimating a quantity which has incorporated in
it several measured quantities, each with its own error. In such a case
one has to estimate the error in the final answer. The way in which the
individual errors accumulate depends upon the arithmetic relationship
between the terms containing the errors.
(a) Error Combination in Problems Involving Addition and Subtraction
The rules of combination of errors are as follows:
(i) For problems involving either addition or subtraction, absolute errors (expressed as standard deviations) are used.
(ii) Suppose one has to evaluate a quantity Z defined as Z = A + B − C. The error in Z (i.e. σZ) will arise partly from σA, σB and σC (the errors in A, B and C respectively). These errors are not simply added, since the error in A may work in the opposite direction to the errors in B or C. To combine the errors, the square root of the sum of the squares of the individual absolute errors is taken as the resultant error:
σZ = √[(σA)² + (σB)² + (σC)²] …………… (9)
(iii) For problems involving either addition or subtraction, the computed absolute error is rounded off to the same decimal place as the final answer.
Worked Example 4
Compute the error involved in the summation
y = 0.50 (±0.02) + 4.10 (±0.03) − 1.97 (±0.05) = 2.63 (±?)
σy = √[(0.02)² + (0.03)² + (0.05)²] = ±0.06
Thus y = 2.63 ± 0.06.
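
A minimal Python sketch of the addition/subtraction rule (equation 9); the helper name is illustrative.

```python
import math

def combined_abs_error(*abs_errors):
    """Combine absolute errors (standard deviations) for addition/subtraction:
    the square root of the sum of the squares of the individual errors."""
    return math.sqrt(sum(e ** 2 for e in abs_errors))

# Worked Example 4: y = 0.50(+/-0.02) + 4.10(+/-0.03) - 1.97(+/-0.05)
y = 0.50 + 4.10 - 1.97
sigma_y = combined_abs_error(0.02, 0.03, 0.05)
print(f"y = {y:.2f} +/- {sigma_y:.2f}")   # y = 2.63 +/- 0.06
```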
(b) Error Combination in Problems Involving Multiplication and Division
The rules of combination of errors are as follows:
(i) For problems involving either multiplication or division, relative errors are used.
(ii) The absolute error for each quantity is converted to a relative error. Suppose one has to evaluate a quantity Z defined as Z = (A × B) / C, and the absolute errors in A, B and C are σA, σB and σC respectively. We first convert these absolute errors to their respective relative errors, (σA)r, (σB)r and (σC)r, as shown below:
(σA)r = (σA / A) × 100
(σB)r = (σB / B) × 100
(σC)r = (σC / C) × 100
(iii) Combine the errors by taking the square root of the sum of the squares of the individual relative errors:
(σZ)r = √[(σA)r² + (σB)r² + (σC)r²] ……………………….. (10)
(iv) To complete the calculation, the above relative error for Z, (σZ)r, is converted back to an absolute error, σZ, as shown below:
σZ = (σZ)r × Z / 100 ………………….. (11)
(v) For problems involving multiplication or division, the computed result is rounded off to the same number of significant figures as the component with the least number of significant figures.
Worked Example 5
Calculate the standard deviation of the result of the following computation:
y = [2.7 (±0.28) × 0.050 (±0.001)] / [1850 (±11) × 42.3 (±0.4)] = 1.725 × 10⁻⁶
We first compute the relative standard deviation of each individual quantity:
(σA)r = (0.28 / 2.7) × 100 = ±10.4%
(σB)r = (0.001 / 0.050) × 100 = ±2.0%
(σC)r = (11 / 1850) × 100 = ±0.59%
(σD)r = (0.4 / 42.3) × 100 = ±0.95%
(σy)r = √[(10.4)² + (2.0)² + (0.59)² + (0.95)²] = ±10.6%
The absolute standard deviation is therefore
σy = (10.6 / 100) × 1.725 × 10⁻⁶ = ±0.18 × 10⁻⁶
so y = 1.73 (±0.18) × 10⁻⁶.
SIGNIFICANT FIGURES
The significant figures of a number include all the certain digits and the first doubtful digit of that number.
It must be noted that the number of significant figures in an experimentally determined value expresses the precision of its measurement. For example, if an object is weighed to the nearest 0.1 mg and has a weight of 12.1230 g, there are six significant figures in the value. It would be wrong to express the value as 12.123 g, as this would imply that one is weighing only to the nearest milligram. On the other hand, if the balance is sensitive to just 0.01 g, it would be incorrect to express the result as 12.120 g; it should be 12.12 g.
Note: the final zeros of numbers must never be omitted when they are significant, or included when they are not.
A certain amount of care is required in determining the number of significant figures to carry in the result of an arithmetic combination of two or more numbers. For addition and subtraction, the number of significant figures can be seen by visual inspection. For example:
3.4 + 0.02 + 1.31 = 4.7
Clearly the second decimal place cannot be significant because an
uncertainty in the first decimal place is introduced by the 3.4.
When data are being multiplied or divided, it is frequently assumed that the number of significant figures in the result is equal to that of the component quantity that contains the least number of significant figures. For example:
(24 × 0.452) / 100.0 = 0.108 = 0.11
Here 24 has two significant figures, and the result has therefore been rounded to agree.
NOTE
1. With products and quotients, quote the answer to the same number of significant figures as the least number of significant figures in the data, or to one more if the first digit in the answer is 1.
2. With addition and subtraction, quote the answer to the same number of decimal places as the least number of decimal places occurring in the data.
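
A small helper, not from the text, showing one way to round a computed result to a chosen number of significant figures in Python.

```python
import math

def round_sig(value, sig_figs):
    """Round a number to a given count of significant figures."""
    if value == 0:
        return 0.0
    exponent = math.floor(math.log10(abs(value)))
    return round(value, sig_figs - 1 - exponent)

print(round_sig(0.10848, 2))   # 0.11 (24 has only two significant figures)
print(round_sig(1.0848, 2))    # 1.1
```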
REJECTION OF A RESULT
Frequently, when a series of replicate analyses is performed, one of the results will appear to differ markedly from the others. A decision will have to be made whether to reject the result or to retain it. Unfortunately, there are no uniform criteria that can be used to decide whether a suspect result can be ascribed to accidental error rather than chance variation. The only reliable basis for rejection is when it can be decided that some specific error may have been made in obtaining the doubtful result. No result should be retained in cases where a known error has occurred in its collection.
Experience and common sense may serve as just as practical a basis for judging the validity of a particular observation as a statistical test would be. Frequently, the experienced analyst will recognize when a particular result is suspect.
A wide variety of statistical tests have been suggested and used to determine whether an observation should be rejected. In all of these, a range is established within which statistically significant observations should fall. The difficulty with all of them is determining what the range should be. If it is too small, then perfectly good data will be rejected; if it is too large, then erroneous measurements will be retained too high a proportion of the time. The Q test is, among the several suggested tests, one of the most statistically correct for a fairly small number of observations and is recommended when a test is necessary. The ratio Q is calculated by arranging the data in decreasing order. The difference between the suspect number and its nearest neighbour (W) is divided by the range (R), that is, the difference between the highest number and the lowest number (i.e. Q = W/R, as shown in Figure 2). The ratio is compared with tabulated values of Q. If it is equal to or greater than the tabulated value, the suspect observation can be rejected. The tabulated values of Q at the 90% confidence level are given in Table 1. If Q exceeds the tabulated value for a given number of observations, the questionable measurement may be rejected with 90% confidence that some definite error is in this measurement.
Table 1: Rejection Quotient, Q, at the 90 Percent Confidence Limit

Number of observations    Q
3                         0.94
4                         0.76
5                         0.64
6                         0.56
7                         0.51
8                         0.47
9                         0.44
10                        0.41
∞                         0.00

Adapted from R. B. Dean and W. J. Dixon, Anal. Chem., 23 (1951) 636.
EXAMPLE 4.7
The following set of chloride analyses on separate aliquots of a pooled serum was reported: 103, 106, 107 and 114 meq/liter. One value appears suspect. Determine whether it can be ascribed to accidental error.
Solution
The suspect result is 114. It differs from its nearest neighbour, 107, by 7 meq/liter. The range is 114 − 103, or 11 meq/liter. Q is therefore 7/11 = 0.64. Since the calculated Q is less than the tabulated Q for four observations (0.76), the suspect number cannot be rejected.
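
A sketch of the Q test in Python, using the 90% rejection quotients from Table 1; the function name is illustrative.

```python
Q_CRIT_90 = {3: 0.94, 4: 0.76, 5: 0.64, 6: 0.56,
             7: 0.51, 8: 0.47, 9: 0.44, 10: 0.41}   # Table 1

def q_test(values):
    """Q test at the 90% confidence level for the most extreme value.
    Returns (Q, critical_Q, reject?)."""
    data = sorted(values)
    r = data[-1] - data[0]                 # range R
    w = max(data[1] - data[0],             # gap if the lowest value is suspect
            data[-1] - data[-2])           # gap if the highest value is suspect
    q = w / r
    q_crit = Q_CRIT_90[len(data)]
    return q, q_crit, q >= q_crit

print(q_test([103, 106, 107, 114]))        # (0.636..., 0.76, False): cannot reject 114
```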
For a small number of measurements (e.g., three to five) the discrepancy of the measurement must be quite large before it can be rejected by this criterion, and it is likely that erroneous results may be retained. This would cause a significant change in the arithmetic mean, because the mean is greatly influenced by a discordant value. For this reason, it has been suggested that the median rather than the mean be reported when a discordant number cannot be rejected from a small number of measurements. The median has the advantage of not being unduly influenced by an outlying value. In the above example, the median would be taken as the average of the two middle values [= (106 + 107)/2 = 106.5]. This compares with a mean of 107.5, which is influenced more by the suspect number.
The following procedure is suggested for interpretation of the data of three to five measurements if the precision is considerably poorer than expected and if one of the observations is considerably different from the others of the set:
1. Estimate the precision that can reasonably be expected for the method, in deciding whether a particular number actually is questionable.
2. Check the data leading to the suspect number to see if a definite error can be identified.
3. If possible, run another analysis. Agreement of the new result with the apparently valid data previously collected will lend support to the view that the suspect result should be rejected.
4. If new data cannot be collected, run a Q test.
5. If the Q test indicates retention of the outlying number, consider reporting the median rather than the mean for a small set of data.
Figure 2: Illustration of the calculation of Q (Q = W/R, where W is the difference between the suspect value and its nearest neighbour and R is the range of the data)
Table 2: Values of F at the 95% Confidence Level (v1 across the top, v2 down the side)

v2\v1     2      3      4      5      6      7      8      9     10     15     20     30
 2     19.0   19.2   19.2   19.3   19.4   19.4   19.4   19.4   19.4   19.4   19.4   19.5
 3      9.55   9.28   9.12   9.01   8.94   8.89   8.85   8.81   8.79   8.70   8.66   8.62
 4      6.94   6.59   6.39   6.26   6.16   6.09   6.04   6.00   5.96   5.86   5.80   5.75
 5      5.79   5.41   5.19   5.05   4.95   4.88   4.82   4.77   4.74   4.62   4.56   4.50
 6      5.14   4.76   4.53   4.39   4.28   4.21   4.15   4.10   4.06   3.94   3.87   3.81
 7      4.74   4.35   4.12   3.97   3.87   3.79   3.73   3.68   3.64   3.51   3.44   3.38
 8      4.46   4.07   3.84   3.69   3.58   3.50   3.44   3.39   3.35   3.22   3.15   3.08
 9      4.26   3.86   3.63   3.48   3.37   3.29   3.23   3.18   3.14   3.01   2.94   2.86
10      4.10   3.71   3.48   3.33   3.22   3.14   3.07   3.02   2.98   2.85   2.77   2.70
15      3.68   3.29   3.06   2.90   2.79   2.71   2.64   2.59   2.54   2.40   2.33   2.25
20      3.49   3.10   2.87   2.71   2.60   2.51   2.45   2.39   2.35   2.20   2.12   2.04
30      3.32   2.92   2.69   2.53   2.42   2.33   2.27   2.21   2.16   2.01   1.93   1.84
TEST OF SIGNIFICANCE
In developing a new analytical method, it is often desirable to compare the results of that method with those of an accepted (perhaps standard) method. How, though, can one tell if there is a significant difference between the new method and the accepted one? Again, we resort to statistics for the answer.
1. The F Test
This is a test designed to indicate whether there is a significant difference between two methods based on their standard deviations. F is defined in terms of the variances of the two methods, where the variance is the square of the standard deviation:
F = S1² / S2² ……………………………… (12)
where S1² > S2². There are two different degrees of freedom, v1 and v2, where the degrees of freedom are defined as N − 1, the number of measurements minus one.
If the calculated F value from Equation (12) exceeds the tabulated F value at the selected confidence level, then there is a significant difference between the two methods. A list of F values at the 95% confidence level is given in Table 2.
Worked Example 6
You are developing a new colorimetric procedure for determining the glucose content of blood serum. You have chosen the standard Folin-Wu procedure with which to compare your results. From the following two sets of replicate analyses on the same sample, determine whether the variance of your method differs significantly from that of the standard method.

Your method (mg/dL)     Folin-Wu method (mg/dL)
127                     130
125                     128
123                     131
130                     129
131                     127
126                     125
129                     –
Mean (x̄1) = 127         Mean (x̄2) = 128

Solution
S1² = Σ(x1 − x̄1)² / (N1 − 1) = 50 / (7 − 1) = 8.3
S2² = Σ(x2 − x̄2)² / (N2 − 1) = 24 / (6 − 1) = 4.8
F = S1² / S2² = 8.3 / 4.8 = 1.73
The variances are arranged so that the F value is greater than 1. The tabulated F value for v1 = 6 and v2 = 5 is 4.95. Since the calculated value is less than this, we conclude that there is no significant difference in the precision of the two methods.
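
A Python sketch of the F-test calculation; carrying full precision (rather than the rounded variances 8.3 and 4.8) gives F ≈ 1.77, leading to the same conclusion.

```python
def variance(values):
    """Sample variance: sum of squared deviations divided by N - 1."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** 2 for x in values) / (len(values) - 1)

your_method = [127, 125, 123, 130, 131, 126, 129]
folin_wu = [130, 128, 131, 129, 127, 125]

v1, v2 = variance(your_method), variance(folin_wu)
f = max(v1, v2) / min(v1, v2)   # larger variance on top, so F >= 1
print(round(v1, 1), round(v2, 1), round(f, 2))
# 8.2 4.7 1.77  (1.73 in the text, which used the rounded variances)
```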
2. The Student t Test
In this method, a comparison is made between two sets of replicate measurements made by two different methods; one of them will be the test method and the other an accepted method. A statistical t value is calculated and compared with a tabulated value for the given number of tests at the desired confidence level (Table 3). If the calculated t value exceeds the tabulated t value, then there is a significant difference between the results of the two methods at that confidence level. If it does not exceed the tabulated value, then we can predict that there is no significant difference between the methods. This in no way implies that the two results are identical.
A test may also be made to determine whether a method under consideration gives significantly different results for a variety of samples when compared with the results obtained by another method for each sample. We assume that both methods have essentially the same standard deviation and that this does not depend on the type of sample; this can be verified using the F test above on replicate analyses of a single sample. The t value is calculated from:
t = (D̄ / Sd) √N ……………………………… (13)
Sd = √[Σ(Di − D̄)² / (N − 1)]

Table 3: Values of t for v Degrees of Freedom at Various Confidence Levels

          Confidence level
v         90%       95%       99%       99.5%
1         6.314     12.706    63.657    127.32
2         2.920     4.303     9.925     14.089
3         2.353     3.182     5.841     7.453
4         2.132     2.776     4.604     5.598
5         2.015     2.571     4.032     4.773
6         1.943     2.447     3.707     4.317
7         1.895     2.365     3.500     4.029
8         1.860     2.306     3.355     3.832
9         1.833     2.262     3.250     3.690
10        1.812     2.228     3.169     3.581
15        1.753     2.131     2.947     3.252
20        1.725     2.086     2.845     3.153
25        1.708     2.060     2.787     3.078
∞         1.645     1.960     2.576     2.807
v = N − 1 = degrees of freedom
Di = the individual difference between the two methods for each sample, with regard to sign
D̄ = the mean of all the individual differences
Worked Example 7
You are developing a new analytical method for the determination of blood urea nitrogen (BUN). You want to determine whether your method differs significantly from a standard one by analyzing a range of sample concentrations expected to be found in the routine laboratory. The following are two sets of results for a number of individual samples.

Sample   Your method (mg/dL)   Standard method (mg/dL)   Di      Di − D̄   (Di − D̄)²
A        10.2                  10.5                      −0.3    −0.6     0.36
B        12.7                  11.9                       0.8     0.5     0.25
C         8.6                   8.7                      −0.1    −0.4     0.16
D        17.5                  16.9                       0.6     0.3     0.09
E        11.2                  10.9                       0.3     0.0     0.00
F        11.5                  11.1                       0.4     0.1     0.01
                                              ΣDi = 1.7           Σ(Di − D̄)² = 0.87
D̄ = 1.7 / 6 = 0.28
Solution
Sd = √[Σ(Di − D̄)² / (N − 1)] = √(0.87 / (6 − 1)) = 0.42
t = (D̄ / Sd) √N = (0.28 / 0.42) √6 = 1.63
The tabulated t value at the 95% confidence level for 5 degrees of freedom is 2.571. Therefore tcalc < ttab, and there is no significant difference between the two methods at this confidence level.
Usually, a test at the 95% confidence level is considered significant, while one at the 99% level is highly significant. That is, the smaller the calculated t value, the more confident you are that there is no significant difference between the two methods. If you employ too low a confidence level (e.g., 80%), you are likely to conclude erroneously that there is a significant difference between the two methods. On the other hand, too high a confidence level will require too large a difference to detect. If a calculated t value is near the tabulated value at the 95% confidence level, more tests should be run to ascertain definitely whether the two methods are significantly different.
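
A Python sketch of the paired t calculation of equation (13); keeping full precision in D̄ and Sd gives t ≈ 1.67 rather than 1.63, with the same conclusion.

```python
import math

def paired_t(results_a, results_b):
    """Paired t statistic for two methods applied to the same samples:
    t = (|D_bar| / S_d) * sqrt(N), with S_d computed from the differences."""
    diffs = [a - b for a, b in zip(results_a, results_b)]
    n = len(diffs)
    d_bar = sum(diffs) / n
    s_d = math.sqrt(sum((d - d_bar) ** 2 for d in diffs) / (n - 1))
    return abs(d_bar) / s_d * math.sqrt(n)

your_method = [10.2, 12.7, 8.6, 17.5, 11.2, 11.5]
standard    = [10.5, 11.9, 8.7, 16.9, 10.9, 11.1]
print(round(paired_t(your_method, standard), 2))
# ~1.67; tabulated t(95%, v = 5) = 2.571, so no significant difference
```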
THE CORRELATION COEFFICIENT
The correlation coefficient is used as a measure of the correlation between two variables. When variables x and y are correlated rather than functionally related (i.e. they are not directly dependent upon one another), we do not speak of the "best" y value corresponding to a given x value, but only of the most probable value. The more closely the observed values cluster about the most probable values, the more definite is the relationship between x and y. This postulate is the basis for various numerical measures of the degree of correlation.
The Pearson correlation coefficient is one of the most convenient to calculate. It is given by
r = Σ(xi − x̄)(yi − ȳ) / [(n − 1) sx sy]
where r is the correlation coefficient, n is the number of observations, sx is the standard deviation of x, sy is the standard deviation of y, xi and yi are the individual values of the variables x and y, respectively, and x̄ and ȳ are their means. The use of differences in the calculation is frequently cumbersome, and the equation can be transformed to a more convenient form.
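
A Python sketch of the Pearson correlation coefficient; the data reused here from Worked Example 7 are purely illustrative.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long data sets."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    cov_sum = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    ssx = sum((x - x_bar) ** 2 for x in xs)
    ssy = sum((y - y_bar) ** 2 for y in ys)
    # Equivalent to cov_sum / ((n - 1) * sx * sy) with sample standard deviations
    return cov_sum / math.sqrt(ssx * ssy)

# Illustrative data: the two BUN data sets from Worked Example 7
x = [10.2, 12.7, 8.6, 17.5, 11.2, 11.5]
y = [10.5, 11.9, 8.7, 16.9, 10.9, 11.1]
print(round(pearson_r(x, y), 3))   # ~0.994, a strong positive correlation
```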