errors and treatment of data

ERRORS AND TREATMENT OF DATA

Error is a measure of the inaccuracy of a measurement. Errors can be divided into two classes:
1. Determinate errors (constant errors and systematic errors)
2. Indeterminate errors (random errors)

1. DETERMINATE ERRORS
Determinate errors have a definite value, are determinable, and can either be avoided or corrected. A determinate error may have the same value under different conditions and may remain constant from one measurement to another. Such determinate errors are called Constant Errors, e.g. the presence of an impurity in the substance used for the standardization of a solution. It is also possible for a determinate error to vary in magnitude and even in sign from one measurement to another; such determinate errors are called Systematic Errors. These are errors which affect the measurement in a regular and predictable way. Errors in the calibration of a scale, in the value of a standard mass, or in the expansion and contraction of a volumetric solution with a change in temperature are common examples of systematic errors. Systematic errors usually introduce a definite bias one way or the other, i.e. they tend to give either a positive or a negative error.

Systematic errors can always, in principle, be corrected for and so eliminated. In practice there are two difficulties: firstly, some systematic errors may remain undetected, and secondly, it may be impracticable to calculate the corrections. To detect unknown systematic errors it is necessary to make one or more independent checks by using completely different methods of measuring the same quantity. Where independent methods give differing results, a systematic error in at least one of the methods is indicated, and possible sources then have to be looked for.

Types of Determinate Errors (or Sources of Determinate Errors)
The determinate errors that must be taken into consideration during analysis are numerous.
Four general classes may be distinguished:

(a) Instrumental errors and those due to apparatus and reagents
i. Balance and weights: these include insufficient sensitivity, uncalibrated weights, etc.
ii. Volumetric apparatus: use of uncalibrated glassware.
iii. Vessels and utensils: e.g. introduction of foreign materials through chemical attack on the glassware.
iv. Reagents: presence of impurities, either the same as the substance sought or interfering substances.

(b) Operative Errors
These are errors that are associated with the analyst himself. They are independent, to a great extent, of the instruments and utensils employed; they are not related to the chemical properties of the system being studied, and their magnitude depends mostly upon the analyst. Operative errors may be very high in magnitude if the analyst is inexperienced, careless or thoughtless. These errors are reduced to insignificant levels by careful, skilful and understanding work. Examples of such errors include:
1. Leaving vessels uncovered and thus introducing dust and other foreign matter into a solution.
2. Spilling of liquids during purification of standard solutions.
3. Loss during transfer of filtrate.
4. Failure to apply temperature corrections in volumetric analysis.
5. Errors in calculations.
6. Use of a non-representative sample.
It should however be noted that the errors inherent in certain manipulations can never be entirely eliminated, but they can be reduced to such a small magnitude by the proper procedure that they will hardly come into consideration.

(c) Personal Errors
(i) Personal errors of the analyst: These originate in the constitutional inability of an individual to make certain observations accurately. An example is the inability to judge colour changes correctly during titration. This could be more serious if the analyst is colour blind; as a result he might be constantly overshooting the end point.
(ii) Prejudice: When it is, for example, a question of which tenth of a division is to be taken in reading a scale, the operator is likely to choose the one that will make the result agree more closely with the preceding one. Or, if there is some doubt as to the exact location of the end point, he is most likely to stop the titration at the point which will give a result in agreement with the previous titration, if he knows what the burette reading should be.

(d) Errors of Method
These are errors that originate from the chemical or microchemical properties of the analytical system. They are the most serious errors encountered in chemical analysis, because they are inherent in the method: no matter how skilfully and carefully the analyst works, the magnitude remains the same unless the conditions of the determination are altered. Some sources of methodic errors are as follows:
(i) In gravimetric analysis, solubility of a precipitate in the wash liquid.
(ii) Failure of a reaction to go to quantitative completion.
(iii) Decomposition of a precipitate on drying.
(iv) Co-precipitation and post-precipitation.
This type of error can be eliminated by the adoption of a better method.

2. INDETERMINATE ERRORS
The second class of errors includes the indeterminate errors, often called accidental or random errors. They are revealed by small differences in successive measurements made by the same analyst under virtually identical conditions, and they cannot be predicted or estimated. These accidental errors follow a random distribution; therefore, mathematical laws of probability can be applied to arrive at some conclusion regarding the most probable result of a series of measurements. It is beyond the scope of this text to go into mathematical probability, but we can say that indeterminate errors should follow a normal distribution, or Gaussian curve.
It is apparent that there should be few very large errors and that there should be an equal number of positive and negative errors, as shown in Figure 1. Indeterminate errors really originate in the limited ability of the analyst to control or make corrections for external conditions, or in his inability to recognize the appearance of factors that will result in errors. Some random errors stem from the inherently statistical nature of things, for example nuclear counting errors. Sometimes, by changing conditions, some unknown errors will disappear. Of course, it is impossible to eliminate all possible random errors in an experiment, and the analyst must be content to minimize them to a tolerable or insignificant level.

Limits of Error
For a result quoted as, e.g., (25.01 ± 0.02) cm², the ± 0.02 is the limit of error. The limits of error in general assess the magnitude of the random error which may be present. Limits of error also provide a measure of the precision of the measurement, i.e. freedom from random error.

Precision: the degree of agreement between replicate measurements of the same quantity, i.e. the repeatability of a result.

Accuracy: the degree of agreement between the measured value and the accepted true value, i.e. closeness to the true value. Thus accuracy implies freedom from systematic error.

Good precision does not assure good accuracy. If there were a systematic error in the analysis, for example if a weight used to measure each of the samples were in error, it would not affect the precision, but it would affect the accuracy. The higher the degree of precision, the greater the chance of obtaining the true value.

WAYS OF EXPRESSING ACCURACY
There are various ways and units in which the accuracy of a measurement can be expressed, an accepted true value for comparison being assumed.
Absolute Error
The difference between the measured value and the true value, with regard to the sign, is the absolute error, and it is reported in the same units as the measurement. If a 2.62-g sample of material is analyzed to be 2.52 g, the absolute error is -0.10 g. If the measured value is the average of several measurements, the error is called the mean error. The mean error can also be calculated by taking the average difference, with regard to sign, of the individual test results from the true value.

Relative Error
The absolute or mean error expressed as a percentage of the true value is the relative error. The above analysis has a relative error of (-0.10/2.62) × 100% = -3.8%. The relative accuracy is the measured value or mean expressed as a percentage of the true value. The above analysis has a relative accuracy of (2.52/2.62) × 100% = 96.2%. We should emphasize that neither number is known to be "true", and the relative error or accuracy is based on the mean of two sets of measurements.

The relative error can be expressed in units other than percentages. In very accurate work, we are usually dealing with relative errors of less than 1%, and it is convenient to use a smaller unit. A 1% error is equivalent to 1 part in 100. It is also equivalent to 10 parts in 1000. This latter unit is commonly used for expressing small uncertainties; that is, the uncertainty is expressed in parts per thousand, written as ppt. The number 23 expressed as parts per thousand of the number 6725 would be 23 parts per 6725, or (23/6725) × 1000 = 3.4 ppt. Parts per thousand is often used in expressing the precision of a measurement.

Worked Example 1
The result of an analysis is 36.97%, compared with the accepted value of 37.06%. What is the relative error in parts per thousand?
Solution:
Absolute error = 36.97% - 37.06% = -0.09%
Relative error = (-0.09 / 37.06) × 1000‰ = -2.4 ppt
(‰ indicates parts per thousand, just as % indicates parts per hundred.)

WAYS OF EXPRESSING PRECISION (OR ESTIMATION OF RANDOM ERRORS)

1. The mean ($\bar{x}$)
The terms mean, arithmetic mean and average are more or less synonymous. The mean is obtained by dividing the sum ($\sum x$) of a set of replicate measurements by the number ($N$) of readings.
$\bar{x} = \frac{\sum x}{N}$ ………….. (1)

2. Mean deviation (or average deviation), $\bar{d}$
The mean deviation of a set of measurements is the mean of the differences between the individual measurements ($x$) and the mean ($\bar{x}$) of the measurements, without regard to sign.
$\bar{d} = \frac{\sum |x - \bar{x}|}{N}$ ………….. (2)

3. Standard deviation
For a very large set of data ($N > 30$):
$\sigma = \sqrt{\frac{\sum (x - \bar{x})^2}{N}}$ ………….. (3)
For $N < 30$:
$s = \sqrt{\frac{\sum (x - \bar{x})^2}{N - 1}}$ ………….. (4)
where $x$ = measured value, $\bar{x}$ = mean, and $N$ = number of readings.

4. Standard error (standard deviation of the mean), $s_m$
$s_m = \frac{s}{\sqrt{N}}$ ………….. (5)
N.B.: The standard deviation is a better measure of precision than the average deviation, especially for a small number of measurements.

5. Variance
This is defined as the square of the standard deviation.
Variance = $s^2$ ………….. (6)

6. Average deviation of the mean ($\bar{d}_{mean}$)
$\bar{d}_{mean} = \frac{\bar{d}}{\sqrt{N}}$

Worked Example 2
Calculate the average deviation and the relative average deviation of the following set of analytical results: 15.67 g, 15.69 g, 16.03 g.

Solution
$\bar{x}$ = (15.67 + 15.69 + 16.03)/3 = 47.39/3 = 15.80 g

x (g)     x - mean          |x - mean|
15.67     15.67 - 15.80     0.13
15.69     15.69 - 15.80     0.11
16.03     16.03 - 15.80     0.23
                      Sum = 0.47

Average deviation (absolute): $\bar{d} = \frac{\sum |x - \bar{x}|}{N}$ = 0.47/3 = 0.16 g
Relative average deviation, in percent: $d_r$ = (average deviation / mean) × 100% = (0.16/15.80) × 100% = 1.0%
Or, in parts per thousand:
Relative average deviation = (average deviation / mean) × 1000 ppt
$d_r$ = (0.16/15.80) × 1000 ppt = 10 ppt

Worked Example 3
Given the following set of weights, 29.8 mg, 30.2 mg, 28.6 mg and 29.7 mg, calculate:
(a) The average deviation and the standard deviation of the individual values.
(b) The average deviation of the mean and the standard deviation of the mean.

Solution:
x (mg)      |x - mean|    (x - mean)²
29.8        0.2           0.04
30.2        0.6           0.36
28.6        1.0           1.00
29.7        0.1           0.01
Sum 118.3   1.9           1.41

$\bar{x}$ = 118.3/4 = 29.6 mg

(a) Average deviation: $\bar{d} = \frac{\sum |x - \bar{x}|}{N}$ = 1.9/4 = 0.48 mg (absolute), or (0.48/29.6) × 100% = 1.6%, or (0.48/29.6) × 1000 = 16 ppt (relative).
Standard deviation: $s = \sqrt{\frac{1.41}{4 - 1}}$ = 0.69 mg

(b) Average deviation of the mean: $\bar{d}_{mean} = \frac{\bar{d}}{\sqrt{N}} = \frac{0.48}{\sqrt{4}}$ = 0.24 mg
Standard deviation of the mean (or standard error): $s_m = \frac{s}{\sqrt{N}} = \frac{0.69}{\sqrt{4}}$ = 0.34 mg

7. Coefficient of variation
When the relative standard deviation is expressed as a percentage we have the coefficient of variation (C.V.).
C.V. = $\frac{s}{\bar{x}} \times 100\%$ ………….. (8)

8. The Normal Distribution
If s has been determined for a sample consisting of a great many readings, it gives a measure of how far individual readings are likely to be from the true value. If, in a series of repeated readings, one seems very different from the rest, the whole series should be repeated. It is bad practice simply to reject an outlying value, because it is characteristic of random errors that large errors do occur occasionally. If the frequency curve of random errors is plotted, it is assumed that their distribution can be represented by the normal frequency curve.

[Figure 1: The normal (Gaussian) frequency curve, frequency plotted against x]

When we have a very large number of readings, one can say that 68% of the readings will be within ±s of the true value, 95% within ±2s and 99.7% within ±3s. It must however be noted that how far individual readings are from the true value is not our main concern, because we normally take several readings and find the mean.
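The precision measures defined above (mean, average deviation, standard deviation, standard deviation of the mean and coefficient of variation) can be checked numerically. The following is a minimal Python sketch and not part of the original notes; the function name `precision_summary` is hypothetical. It works with the unrounded mean, so its results differ slightly from hand calculations that first round the mean to 29.6 mg.

```python
import math
import statistics


def precision_summary(readings):
    """Return common precision measures for a small set of replicate readings."""
    n = len(readings)
    mean = statistics.fmean(readings)
    # Average deviation: mean of |x - mean|, taken without regard to sign.
    avg_dev = sum(abs(x - mean) for x in readings) / n
    # Sample standard deviation (N - 1 in the denominator), suited to small N.
    s = statistics.stdev(readings)
    return {
        "mean": mean,
        "average_deviation": avg_dev,
        "std_dev": s,
        "variance": s ** 2,
        "avg_dev_of_mean": avg_dev / math.sqrt(n),
        "std_error": s / math.sqrt(n),
        "cv_percent": 100 * s / mean,
    }


weights_mg = [29.8, 30.2, 28.6, 29.7]  # the four weights from the worked example
summary = precision_summary(weights_mg)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

With the unrounded mean of 29.575 mg, the average deviation comes out near 0.49 mg and the standard deviation near 0.68 mg, close to the 0.48 mg and 0.69 mg obtained by hand.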
In practice one is more interested in how far the mean of the readings is from the true value, although it must be emphasized that there is no way of telling what the true value is. It is however desirable to be able to assign a probability to the mean lying within a certain range of the true value. This range will depend not only on the spread of the individual readings but also on N, the number of such readings. This range is in fact specified by a quantity called the standard deviation of the mean, $s_m$.

RULES OF COMBINATION OF ERRORS
So far we have considered errors as a single quantity. Most often, in an experiment, one is estimating a quantity which has incorporated in it several measured quantities, each with its own error. In such a case one has to estimate the error in the final answer. The way in which the individual errors accumulate depends upon the arithmetic relationship between the terms containing the errors.

(a) Error Combination in Problems Involving Addition and Subtraction
The rules of combination of errors are as follows:
(i) For problems involving either addition or subtraction, absolute errors (expressed as standard deviations) are used.
(ii) Suppose one has to evaluate a quantity Z defined as Z = A + B - C. The error in Z (i.e. $\Delta Z$) will arise partly from each of the errors $\Delta A$, $\Delta B$ and $\Delta C$ in A, B and C respectively. These errors are not simply added, in that the error in A may work in the opposite direction to the errors in B or C. So to combine the errors, the square root of the sum of the squares of the individual absolute errors is taken as the resultant error, i.e.
$\Delta Z = \sqrt{(\Delta A)^2 + (\Delta B)^2 + (\Delta C)^2}$ …………… (9)
or
$\Delta Z = [(\Delta A)^2 + (\Delta B)^2 + (\Delta C)^2]^{1/2}$
(iii) For problems involving either addition or subtraction, the computed absolute error is rounded off to the same decimal place as the final answer.

Worked Example 4
Compute the error involved in the summation
y = 0.50 (±0.02) + 4.10 (±0.03) - 1.97 (±0.05) = 2.63 (±?)
$\Delta y = \sqrt{(\pm 0.02)^2 + (\pm 0.03)^2 + (\pm 0.05)^2} = \pm 0.06$
Thus y = 2.63 ± 0.06

(b) Error Combination in Problems Involving Multiplication and Division
The rules of combination of errors are as follows:
(i) For problems involving either multiplication or division, relative errors are used.
(ii) The absolute error for each quantity is converted to a relative error. Suppose one has to evaluate a quantity Z, defined as
$Z = \frac{A \times B}{C}$
and the absolute errors in A, B and C are $\Delta A$, $\Delta B$ and $\Delta C$ respectively. We first convert these absolute errors to their respective relative errors $(\Delta A)_r$, $(\Delta B)_r$ and $(\Delta C)_r$, as shown below.
$(\Delta A)_r = \frac{\Delta A}{A} \times 100$
$(\Delta B)_r = \frac{\Delta B}{B} \times 100$
$(\Delta C)_r = \frac{\Delta C}{C} \times 100$
(iii) Combine the errors by taking the square root of the sum of the squares of the individual relative errors, i.e.
$(\Delta Z)_r = \sqrt{(\Delta A)_r^2 + (\Delta B)_r^2 + (\Delta C)_r^2}$ ……………………….. (10)
(iv) To complete the calculation, the above relative error for Z, i.e. $(\Delta Z)_r$, is converted to the absolute error $\Delta Z$, as shown below.
$\Delta Z = \frac{(\Delta Z)_r \times Z}{100}$ ………………….. (11)
(v) For problems involving multiplication or division, the computed absolute error is rounded off to the same number of significant figures as the component with the fewest significant figures.

Worked Example 5
Calculate the standard deviation of the result of the following computation.
$y = \frac{2.7(\pm 0.28) \times 0.050(\pm 0.001)}{1850(\pm 11) \times 42.3(\pm 0.4)} = 1.725 \times 10^{-6}$

We first compute the relative standard deviations of the individual quantities, labelled A to D in the order in which they appear.
$(\Delta A)_r$ = (0.28/2.7) × 100 = ±10.4%
$(\Delta B)_r$ = (0.001/0.050) × 100 = ±2.0%
$(\Delta C)_r$ = (11/1850) × 100 = ±0.59%
$(\Delta D)_r$ = (0.4/42.3) × 100 = ±0.95%
$(\Delta y)_r = \sqrt{(10.4)^2 + (2.0)^2 + (0.59)^2 + (0.95)^2}$ = ±10.6%
The absolute standard deviation will be
$\Delta y = \pm \frac{10.6}{100} \times 1.725 \times 10^{-6} = \pm 0.18 \times 10^{-6}$
so that y = 1.7 (±0.2) × 10⁻⁶.

SIGNIFICANT FIGURES
The significant figures of a number include all the certain digits and the first doubtful digit of that number. It must be noted that the number of significant figures in an experimentally determined value expresses the precision of its measurement.
For example, if an object is weighed to the nearest 0.1 mg and has the weight 12.1230 g, there are six significant figures in the value. It would be wrong to express the value as 12.123 g, as this would mean that one is weighing to the nearest milligram. On the other hand, if the balance is sensitive to just 0.01 g, it would be incorrect to express the result as 12.120 g; it should be 12.12 g.
Note: The final zeros of numbers must never be omitted when they are significant, or included when they are not.

A certain amount of care is required in determining the number of significant figures to carry in the result of an arithmetic combination of two or more numbers. For addition and subtraction, the number of significant figures can be seen by visual inspection. For example:
3.4 + 0.02 + 1.31 = 4.7
Clearly the second decimal place cannot be significant, because an uncertainty in the first decimal place is introduced by the 3.4.

When data are being multiplied or divided, it is frequently assumed that the number of significant figures for the result is equal to that of the component quantity that contains the least number of significant figures. For example:
(24 × 0.452)/100.0 = 0.108 = 0.11
Here 24 has two significant figures, and the result has therefore been rounded to agree.

NOTE
1. With products and quotients, quote the answer to the same number of significant figures as the least number in the data, or to one more if the first digit in the answer is 1.
2. With addition and subtraction, quote the answer to the least number of decimal places occurring in the data.

REJECTION OF A RESULT
Frequently, when a series of replicate analyses is performed, one of the results will appear to differ markedly from the others. A decision will have to be made whether to reject the result or to retain it. Unfortunately, there are no uniform criteria that can be used to decide if a suspect result can be ascribed to accidental error rather than chance variation.
The only reliable basis for rejection is when it can be decided that some specific error may have been made in obtaining the doubtful result. No result should be retained in cases where a known error has occurred in its collection.

Experience and common sense may serve as just as practical a basis for judging the validity of a particular observation as a statistical test would be. Frequently, the experienced analyst will recognize when a particular result is suspect.

A wide variety of statistical tests have been suggested and used to determine whether an observation should be rejected. In all of these a range is established within which statistically significant observations should fall. The difficulty with all of them is determining what the range should be. If it is too small, then perfectly good data will be rejected; if it is too large, then erroneous measurements will be retained too high a proportion of the time. The Q test is, among the several suggested tests, one of the most statistically correct for a fairly small number of observations and is recommended when a test is necessary.

The ratio Q is calculated by arranging the data in decreasing order of numbers. The difference between the suspect number and its nearest neighbour (W) is divided by the range (R), that is, the difference between the highest number and the lowest number (i.e. Q = W/R, as shown in Figure 2). The ratio is compared with tabulated values of Q. If it is equal to or greater than the tabulated value, then the suspected observation can be rejected. The tabulated values of Q at the 90% confidence level are given in Table 1. If Q exceeds the tabulated value for a given number of observations, then the questionable measurement may be rejected with 90% confidence that some definite error is in this measurement.

TABLE 1: Rejection Quotient, Q, at the 90 Percent Confidence Limit
Number of Observations    Q
3                         0.94
4                         0.76
5                         0.64
6                         0.56
7                         0.51
8                         0.47
9                         0.44
10                        0.41
∞                         0.00
Adapted from R. B. Dean and W. J. Dixon, Anal.
Chem., 23 (1951) 636.

EXAMPLE 4.7
The following set of chloride analyses on separate aliquots of a pooled serum was reported. One value appears suspect. Determine if it can be ascribed to accidental error: 103, 106, 107, 114 meq/liter.

Solution
The suspect result is 114. It differs from its nearest neighbour, 107, by 7 meq/liter. The range is 114 - 103, or 11 meq/liter. Q is therefore 7/11 = 0.64. Since the calculated Q is less than the tabulated Q for four observations (0.76), the suspected number cannot be rejected.

For a small number of measurements (e.g. three to five), the discrepancy of the measurement must be quite large before it can be rejected by this criterion, and it is likely that erroneous results may be retained. This would cause a significant change in the arithmetic mean, because the mean is greatly influenced by a discordant value. For this reason, it has been suggested that the median rather than the mean be reported when a discordant number cannot be rejected from a small number of measurements. The median has the advantage of not being unduly influenced by an outlying value. In the above example, the median is taken as the average of the two middle values, (106 + 107)/2 = 106.5, or about 106. This compares with a mean of 108, which is influenced more by the suspected number.

The following procedure is suggested for interpretation of the data of three to five measurements if the precision is considerably poorer than expected and if one of the observations is considerably different from the others of the set.
1. Estimate the precision that can reasonably be expected for the method in deciding whether a particular number actually is questionable.
2. Check the data leading to the suspected number to see if a definite error can be identified.
3. If possible, run another analysis. Agreement of the new result with the apparently valid data previously collected will lend support to the opinion that the suspected result should be rejected.
4. If new data cannot be collected, run a Q test.
5. If the Q test indicates retention of the outlying number, consider reporting the median rather than the mean for a small set of data.

[Figure 2: Illustration of the calculation of Q = W/R, where W is the gap between the suspect value and its nearest neighbour and R is the range of the data]

Table 2: Values of F at the 95% Confidence Level
(columns: v1; rows: v2)
v2\v1   2      3      4      5      6      7      8      9      10     15     20     30
2       19.0   19.2   19.2   19.3   19.4   19.4   19.4   19.4   19.4   19.4   19.4   19.5
3       9.55   9.28   9.12   9.01   8.94   8.89   8.85   8.81   8.79   8.70   8.66   8.62
4       6.94   6.59   6.39   6.26   6.16   6.09   6.04   6.00   5.96   5.86   5.80   5.75
5       5.79   5.41   5.19   5.05   4.95   4.88   4.82   4.77   4.74   4.62   4.56   4.50
6       5.14   4.76   4.53   4.39   4.28   4.21   4.15   4.10   4.06   3.94   3.87   3.81
7       4.74   4.35   4.12   3.97   3.87   3.79   3.73   3.68   3.64   3.51   3.44   3.38
8       4.46   4.07   3.84   3.69   3.58   3.50   3.44   3.39   3.35   3.22   3.15   3.08
9       4.26   3.86   3.63   3.48   3.37   3.29   3.23   3.18   3.14   3.01   2.94   2.86
10      4.10   3.71   3.48   3.33   3.22   3.14   3.07   3.02   2.98   2.85   2.77   2.70
15      3.68   3.29   3.06   2.90   2.79   2.71   2.64   2.59   2.54   2.40   2.33   2.25
20      3.49   3.10   2.87   2.71   2.60   2.51   2.45   2.39   2.35   2.20   2.12   2.04
30      3.32   2.92   2.69   2.53   2.42   2.33   2.27   2.21   2.16   2.01   1.93   1.84

TEST OF SIGNIFICANCE
In developing a new analytical method, it is often desirable to compare the results of that method with those obtained using an accepted (perhaps standard) method. How, though, can one tell if there is a significant difference between the new method and the accepted one? Again, we resort to statistics to give us the answer.

1. The F Test
This is a test designed to indicate whether there is a significant difference between two methods based on their standard deviations. F is defined in terms of the variances of the two methods, where the variance is the square of the standard deviation.
$F = \frac{s_1^2}{s_2^2}$ ……………………………… (12)
where $s_1^2 > s_2^2$. There are two different degrees of freedom, v1 and v2, where the degrees of freedom is defined as N - 1, the number of measurements minus one.
If the calculated F value from Equation 12 exceeds a tabulated F value at the selected confidence level, then there is a significant difference between the two methods. A list of F values at the 95% confidence level is given in Table 2.

Worked Example 6
You are developing a new colorimetric procedure for determining the glucose content of blood serum. You have chosen the standard Folin-Wu procedure with which to compare your results. From the following two sets of replicate analyses on the same sample, determine whether the variance of your method differs significantly from that of the standard method.

Your method (mg/dl)     Folin-Wu method (mg/dl)
127                     130
125                     128
123                     131
130                     129
131                     127
126                     125
129                     -
Mean (x̄1) = 127         Mean (x̄2) = 128

Solution
$s_1^2 = \frac{\sum (x_{i1} - \bar{x}_1)^2}{N_1 - 1} = \frac{50}{7 - 1} = 8.3$
$s_2^2 = \frac{\sum (x_{i2} - \bar{x}_2)^2}{N_2 - 1} = \frac{24}{6 - 1} = 4.8$
$F = \frac{s_1^2}{s_2^2} = \frac{8.3}{4.8} = 1.73$
The variances are arranged so that the F value is greater than 1. The tabulated F value for v1 = 6 and v2 = 5 is 4.95. Since the calculated value is less than this, we conclude that there is no significant difference in the precision of the two methods.

2. The Student t Test
In this method, comparison is made between two sets of replicate measurements made by two different methods; one of them will be the test method, and the other will be an accepted method. A statistical t value is calculated and compared with a tabulated value for the given number of tests at the desired confidence level (Table 3). If the calculated t value exceeds the tabulated t value, then there is a significant difference between the results by the two methods at that confidence level. If it does not exceed the tabulated value, then we can predict that there is no significant difference between the methods. This in no way implies that the two results are identical.
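The F test of Worked Example 6 can be reproduced in a few lines. This is a minimal sketch rather than part of the original notes: the helper name `f_ratio` is hypothetical, the third reading of the new method is taken as 123 mg/dl (an assumption consistent with the stated mean of 127), and the critical value is copied from Table 2 rather than computed.

```python
import statistics


def f_ratio(set_a, set_b):
    """Return (F, v1, v2) with the larger variance in the numerator so F >= 1."""
    var_a = statistics.variance(set_a)  # sample variance, N - 1 denominator
    var_b = statistics.variance(set_b)
    if var_a >= var_b:
        return var_a / var_b, len(set_a) - 1, len(set_b) - 1
    return var_b / var_a, len(set_b) - 1, len(set_a) - 1


new_method = [127, 125, 123, 130, 131, 126, 129]  # third value read as 123 (assumption)
folin_wu = [130, 128, 131, 129, 127, 125]

f_calc, v1, v2 = f_ratio(new_method, folin_wu)
f_table = 4.95  # Table 2 entry for v1 = 6, v2 = 5 at the 95% level

print(f"F = {f_calc:.2f} (v1 = {v1}, v2 = {v2})")
print("significant difference" if f_calc > f_table else "no significant difference")
```

Because the sketch uses unrounded means, it gives F of about 1.77 rather than the 1.73 found by hand; the conclusion is unchanged, since both are well below 4.95.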
A test is made to determine whether a method under consideration gives significantly different results for a variety of samples when compared to results obtained by another method for each sample. We assume that both methods have essentially the same standard deviation and that this does not depend on the type of sample. This can be verified using the F test above on replicate analyses of a single sample. The t value is calculated from:
$t = \frac{\bar{D}}{s_d} \sqrt{N}$ ……………………………… (13)
$s_d = \sqrt{\frac{\sum (D_i - \bar{D})^2}{N - 1}}$
where
$D_i$ = the individual differences between the two methods for each sample, with regard to sign
$\bar{D}$ = the mean of all the individual differences
v = N - 1 = degrees of freedom

Table 3: Values of t for v Degrees of Freedom for Various Confidence Levels
v       90%       95%       99%       99.5%
1       6.314     12.706    63.657    127.32
2       2.920     4.303     9.925     14.089
3       2.353     3.182     5.841     7.453
4       2.132     2.776     4.604     5.598
5       2.015     2.571     4.032     4.773
6       1.943     2.447     3.707     4.317
7       1.895     2.365     3.500     4.029
8       1.860     2.306     3.355     3.832
9       1.833     2.262     3.250     3.690
10      1.812     2.228     3.169     3.581
15      1.753     2.131     2.947     3.252
20      1.725     2.086     2.845     3.153
25      1.708     2.060     2.787     3.078
∞       1.645     1.960     2.576     2.807

Worked Example 7
You are developing a new analytical method for the determination of blood urea nitrogen (BUN). You want to determine whether your method differs significantly from a standard one by analyzing a range of sample concentrations expected to be found in the routine laboratory. The following are two sets of results for a number of individual samples.

Sample   Your method (mg/dl)   Standard method (mg/dl)   Di      Di - D̄    (Di - D̄)²
A        10.2                  10.5                      -0.3    -0.6      0.36
B        12.7                  11.9                      0.8     0.5       0.25
C        8.6                   8.7                       -0.1    -0.4      0.16
D        17.5                  16.9                      0.6     0.3       0.09
E        11.2                  10.9                      0.3     0.0       0.00
F        11.5                  11.1                      0.4     0.1       0.01
Sum                                                      1.7               0.87
D̄ = 1.7/6 = 0.28

Solution
$s_d = \sqrt{\frac{0.87}{6 - 1}} = 0.42$
$t = \frac{\bar{D}}{s_d} \sqrt{N} = \frac{0.28}{0.42} \times \sqrt{6} = 1.63$
The tabulated t value at the 95% confidence level for 5 degrees of freedom is 2.571. Therefore the calculated t is less than the tabulated t, and there is no significant difference between the two methods at this confidence level.
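The paired comparison of Worked Example 7 can be checked with a short script. This is an illustrative sketch, not part of the original notes: the function name `paired_t` is hypothetical, the standard-method value for sample C is taken as 8.7 mg/dl (the value implied by the tabulated difference of -0.1), and the critical value is copied from Table 3.

```python
import math
import statistics


def paired_t(method_1, method_2):
    """Return (t, degrees_of_freedom) for paired results on the same samples."""
    diffs = [a - b for a, b in zip(method_1, method_2)]
    n = len(diffs)
    d_mean = statistics.fmean(diffs)
    s_d = statistics.stdev(diffs)  # sqrt(sum((D_i - D_mean)^2) / (N - 1))
    # The magnitude of t is what gets compared with the table, hence abs().
    t = abs(d_mean) / s_d * math.sqrt(n)
    return t, n - 1


new_method = [10.2, 12.7, 8.6, 17.5, 11.2, 11.5]   # samples A to F
standard = [10.5, 11.9, 8.7, 16.9, 10.9, 11.1]     # sample C read as 8.7 (assumption)

t_calc, dof = paired_t(new_method, standard)
t_table = 2.571  # Table 3 entry for 5 degrees of freedom at the 95% level

print(f"t = {t_calc:.2f} with {dof} degrees of freedom")
print("significant difference" if t_calc > t_table else "no significant difference")
```

Without intermediate rounding the script gives t of about 1.67 instead of 1.63; either way the value is below 2.571.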
Usually, a test at the 95% confidence level is considered significant, while one at the 99% level is highly significant. That is, the smaller the calculated t value, the more confident you are that there is no significant difference between the two methods. If you employ too low a confidence level (e.g. 80%), you are likely to conclude erroneously that there is a significant difference between the two methods. On the other hand, too high a confidence level will require too large a difference to detect. If a calculated t value is near the tabular value at the 95% confidence level, more tests should be run to ascertain definitely whether the two methods are significantly different.

THE CORRELATION COEFFICIENT
The correlation coefficient is used as a measure of the correlation between two variables. When variables x and y are correlated rather than being functionally related (i.e. are not directly dependent upon one another), we do not speak of the "best" y value corresponding to a given x value, but only of the most probable value. The closer the observed values are to the most probable values, the more definite is the relationship between x and y. This postulate is the basis for various numerical measures of the degree of correlation.

The Pearson correlation coefficient is one of the most convenient to calculate. This is given by:
$r = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{n s_x s_y}$
where $r$ is the correlation coefficient, $n$ is the number of observations, $s_x$ is the standard deviation of x, $s_y$ is the standard deviation of y, $x_i$ and $y_i$ are the individual values of the variables x and y, respectively, and $\bar{x}$ and $\bar{y}$ are their means. The use of differences in the calculation is frequently cumbersome, and the equation can be transformed to a more convenient form.
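As an illustration of the two routes to the correlation coefficient, the sketch below computes r once from deviations about the means and once from raw sums. It is not taken from the original notes: the function names and the data are hypothetical, the deviation form used is the standard Pearson expression, and the sum-based expression is one commonly used rearrangement, which may or may not be the exact transformed equation the notes had in mind.

```python
import math


def pearson_r(xs, ys):
    """Correlation coefficient from deviations about the means."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    syy = sum((y - y_mean) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def pearson_r_sums(xs, ys):
    """Same coefficient from raw sums, avoiding explicit differences."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    num = n * sum_xy - sum_x * sum_y
    den = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return num / den


x = [1, 2, 3, 4, 5]  # hypothetical data, for illustration only
y = [2, 4, 5, 4, 5]

print(f"r (deviation form) = {pearson_r(x, y):.4f}")
print(f"r (sum form)       = {pearson_r_sums(x, y):.4f}")
```

Both functions return the same value for the same data; the sum form needs only running totals, which is why it is easier to evaluate by hand or on a simple calculator.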
