EXPERIMENT 5. STATISTICS IN SCIENTIFIC MEASUREMENTS AND THE USAGE OF A SPREADSHEET PROGRAM

Part I. Basic Statistics and Graphing (mainly for CHEM 1421 Lab)
    Section A. Average, Error, Deviation, Accuracy, Precision
    Section B. Scientific Plotting with a Spreadsheet
Part II. Intermediate Statistics (mainly for CHEM 1422 Lab)
    Section A. Propagation of Random Errors
    Section B. Least Squares Analysis
(Part II is not included in this Lab; please refer to the Contents under CHEM 1422 Lab.)

PART I. BASIC STATISTICS AND GRAPHING (CHEM 1421 Lab)

OBJECTIVE

To review several basic concepts in statistics for scientific measurements and data analysis, to prepare a graph from data, and to introduce the use of spreadsheet software for these practices.

BACKGROUND

Many scientific activities involve measuring the values of some quantities. Because of various limitations in quantitative measurements, every measured quantity carries an error, that is, an uncertainty in the measured value. Uncertainty in a measurement can be described in terms of both accuracy and precision.

Section A. Average, Error and Deviation, Accuracy and Precision

Accuracy tells us how close a measured value (or an average of a set of measured values) is to the true value, and it is given in terms of an error. The error is defined as the absolute value of the difference between the true value ($x_{tr}$) and a measured value ($x_i$, or the average of the measured values, $x_{av}$). "The smaller an error, the more accurate the measurements are."

Absolute Error = |True Value - Measured Value|
    $= |x_{tr} - x_i|$ for an error in an individual value
    $= |x_{tr} - x_{av}|$ for an error in the average value

% Relative Error = (Absolute Error / True Value) x 100
    $= (|x_{tr} - x_i| / x_{tr}) \times 100$ for an individual value
    $= (|x_{tr} - x_{av}| / x_{tr}) \times 100$ for the average value

The average value or mean value is defined as
    $x_{av} = \sum x_i / n$
where n is the number of measurements.

Precision tells us how consistent the measurements are within a set of measured data. It is a measure of the reproducibility of the measurements.
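
As a rough illustration of the accuracy formulas above (average, absolute error, and % relative error), the following Python sketch may help. It is not part of the QuattroPro exercise, the helper names are assumptions chosen for this illustration, and the data are the penny masses and accepted value quoted in Example #1 of this handout.

```python
# Illustrative sketch only: the helper names below are assumptions,
# not something defined by the lab handout or by QuattroPro.

def mean_value(values):
    """Return the average x_av = (sum of x_i) / n."""
    return sum(values) / len(values)


def absolute_error(true_value, measured_value):
    """Return |x_tr - x| for a single value or for an average."""
    return abs(true_value - measured_value)


def percent_relative_error(true_value, measured_value):
    """Return (absolute error / true value) x 100."""
    return absolute_error(true_value, measured_value) / true_value * 100


# Penny masses (g) and accepted value (g) taken from Example #1.
masses = [3.088, 3.063, 2.477, 2.518, 3.114]
accepted = 2.797

x_av = mean_value(masses)
print(f"average mass     = {x_av:.3f} g")
print(f"absolute error   = {absolute_error(accepted, x_av):.3f} g")
print(f"% relative error = {percent_relative_error(accepted, x_av):.1f} %")
```

The same three functions can be applied to a single measurement by passing an individual mass instead of the average.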
The concept of precision applies only to a set of multiple data, not to a single, separate measurement. Precision is normally given in terms of a deviation. The deviation of a single individual measurement is defined as the absolute value of the difference between the average value and the measured value. "The smaller a deviation, the more precise the measurements are."

Deviation = |Average Value - Measured Value|
    i.e., $d_i = |x_{av} - x_i|$

The deviation for a set of measured values can be given in terms of either an average deviation or a standard deviation; the latter is more useful.

Average Deviation = (Sum of all Deviations) / (# of Measurements)
    i.e., $d_{av} = \sum d_i / n$, where n is the number of data

% Relative Average Deviation $= (d_{av} / x_{av}) \times 100$

Standard Deviation = [(Sum of all deviations squared) / (# of measurements - 1)]^(1/2)
    $d_{st} = s = \left[ \sum (x_{av} - x_i)^2 / (n-1) \right]^{1/2} = \left[ \sum d_i^2 / (n-1) \right]^{1/2}$

Variance is the square of the standard deviation:
    $V = s^2 = \sum d_i^2 / (n-1)$

A relative standard deviation can be given as a percentage of the average; this is called the coefficient of variation, v.

coefficient of variation (or % relative standard deviation):
    $v = (s / x_{av}) \times 100$

The spread of the experimental measurements about the average value is best represented by the standard deviation, s. For normally distributed measurements, the probability that a measurement lies within
    $x_{av} \pm \tfrac{1}{2}s$ is 38.3 %,
    $x_{av} \pm s$ is 68.3 %,
    $x_{av} \pm 2s$ is 95.5 %,
    $x_{av} \pm 3s$ is 99.7 %.

Often a true value is not known or cannot be determined accurately, and an average from many reliable trials can be accepted as the true value. Then the numerical values of the error and the deviation become the same. For example, if a large number of weighings were made for pennies, the average mass of the penny is accepted as the true value.

Example #1: Five different pennies were weighed. Their individual masses were 3.088, 3.063, 2.477, 2.518, and 3.114 g. Assume that the true (accepted) value is 2.797 g.
Find (a) the average mass, (b) the absolute error in the average mass, (c) the % relative error in the average mass, (d) the average deviation, (e) the % relative average deviation, (f) the standard deviation, and (g) the coefficient of variation.

Trial      $x_i$ (g)    $d_i = |x_i - x_{av}|$    $d_i^2 = (x_i - x_{av})^2$
1          3.088        0.236                     0.055696
2          3.063        0.211                     0.044521
3          2.477        0.375                     0.140625
4          2.518        0.334                     0.111556
5          3.114        0.262                     0.068644
Sums       14.260       1.418                     0.421042
Average    2.852        0.2836

(a) the average: $x_{av} = \sum x_i / n = 14.260 / 5 = 2.852$ g
(b) the absolute error in the average: $|x_{tr} - x_{av}| = |2.797 - 2.852| = 0.055$ g
(c) the % relative error in the average: $(|x_{tr} - x_{av}| / x_{tr}) \times 100 = (0.055 / 2.797) \times 100 = 2.0$ %
(d) the average deviation: $d_{av} = \sum d_i / n = 1.418 / 5 = 0.2836$ g
(e) the % relative average deviation: $(d_{av} / x_{av}) \times 100 = (0.2836 / 2.852) \times 100 = 9.9$ %
(f) the standard deviation: $s = [\sum d_i^2 / (n-1)]^{1/2} = [0.421042 / (5-1)]^{1/2} = 0.324$ g
(g) the coefficient of variation: $v = (s / x_{av}) \times 100 = (0.324 / 2.852) \times 100 = 11.4$ % (i.e., the % relative standard deviation = 11.4 %)

LAB EXERCISE (with a QuattroPro Spreadsheet Program)

Open the QuattroPro notebook named "CHEM1421.WB2" from the local A Drive with your diskette in it. This can be done with the following steps. Selections to be clicked on are underlined.

(1) Double-click on My Computer from the Windows Desktop.
(2) Click on the A Drive.
(2') This is an alternative in case your diskette is defective. Click on the network M Drive (the Notebook Drive). Select the folder path as follows: Science, then Mkim, then C142xDISK.
(3) Click on CHEM1421.WB2.
    a. The first page (sheet) states the problem (tab label: Stat Problem).
    b. Go to the next page (tab label: Work Out) by clicking its tab.
    c. Fill in the blank cells in the columns properly before answering any questions. Your instructor may guide you.
    d. After the spreadsheet is properly completed, give the answers by replacing the question marks (?) with the appropriate formula for each cell.

Note:
(1) A formula entered into a cell begins with a + or = sign.
(2) All functions begin with an @ sign. For example, +@SQRT(E15) takes the square root of the content of cell E15 and places the value in the current cell.

Note: Why are the % relative deviations so large, about 10 % (9.9 % and 11.4 %), in these measurements? This is because two different types of pennies are mixed in the sample above. The pure copper penny has an average mass of 3.08 g and the copper-coated zinc penny has an average mass of 2.52 g. You may try to find the average and the % relative deviations for a sample of five pennies that contains only one type of penny (pure copper): 3.088, 3.063, 3.114, 3.062, 3.097 g. You will see that the % relative deviations are less than 1 %, regardless of whether they are given in terms of an average deviation or a standard deviation.

Systematic Error vs. Random Error

Experimental data are subject to two types of errors: systematic (determinate) errors and random (indeterminate) errors.

Systematic Error: This type of error has a definite cause, is uni-directional (always positive or always negative), and can be corrected by careful work and calibration of equipment. Systematic errors typically reduce the accuracy of an experiment without necessarily lowering the precision of the results. An example is a tire pressure gauge that consistently reads 5 psi high.

Random Error: This type of error arises from the inherent limits of any measuring device, is bi-directional (randomly positive or negative), and cannot be corrected. For example, the type of analytical balance used in our laboratory cannot weigh any object more accurately than ±0.0003 g (±0.3 mg).

Since systematic errors can always be removed by well-designed and careful measurements, it is the random errors that one must deal with in an analysis of experimental data. Details of handling random errors in scientific measurements will be given in Part II, Section A, Propagation of Random Errors.

Section B.
Scientific Graphing with Spreadsheet Software (QuattroPro)

In this exercise, you will build a spreadsheet and a graph by reproducing the sample spreadsheet and graph given on the Density_Sample (tab label) page of the CHEM1421.WB2 notebook. Refer to the Figure below. You will be writing all of them by yourself.

Open the empty page labelled Density_Workout. Type the titles and the density of water vs. temperature data (Appendix Table B1) into the Quattro Pro spreadsheet. You might use the cells B5 through B35 for the temperature and the cells D5 through D35 for the density values, or any other cells in any two columns. After you complete the two columns for the X- and Y-axes, you can build the graph based on the values in the columns.

How To Graph with Quattro Pro 6.0 (9/10/99)

The selections to be clicked on are underlined. You type in the italicized parts.

(1) Click on Graphics.
(2) Select New Graph.
(3) Type in the Graph Name, or simply accept the pre-given name. Type in the following (or other appropriate) cell addresses for the variables.
        X-axis: B8..B35
        1-st:   D8..D35
    You may leave the Legend blank. Click OK. You will get a graph with a curve. If you get a display of numbers, a pie graph, a bar graph, or something else at this point, click on Graphics again and select Types. Bullet-mark 2-D, then select the graph type in the second column and second row, i.e., the graph with two crossing lines. Click OK.
(4) To add titles, click on Graphics, and then select Title. Give appropriate titles.
        Main Title: Density of Water
        Subtitle:   at Various Temperatures
        X-Axis:     Temperature (degree Celsius)
        Y-Axis:     Density (g/mL)
    Click OK.
(6) Click on File, and select Print for a hard copy of the chart.
(7) To bring and embed the chart into the same spreadsheet you just prepared, highlight the chart by clicking on it. Then click on Edit, and on Copy.
(8) Go back to the spreadsheet by clicking on Window in the menu and selecting the current spreadsheet notebook (CHEM1421.WB2). Select a location for the chart by clicking on any appropriate cell (e.g., F7) where the chart should start. Paste the chart from the Edit menu. De-select the chart by clicking on any area of the spreadsheet outside the chart area.
(9) Type your name somewhere on the spreadsheet for identification.
(10) Print if a hard copy of the spreadsheet with the graph is desired.
(11) Save, and you may Exit.
(12) Attach the chart with glue at the end of your Lab Record Book.

Work out the Vapor Pressure sheet and graph on the next page of the QuattroPro notebook by following steps similar to those given above. This is homework. You should attach the completed Vapor Pressure graph at the end of your Lab Record Book with glue.

Regression Analysis with the Least Squares Method

This method will be briefly introduced later and used in the Beer's Law Lab (Absorption Spectroscopy) in CHEM 1421 Lab, and it will be used extensively in CHEM 1422 Lab. Detailed instructions and examples of least squares analysis with QuattroPro for Windows (QPW) can be found in Part II, Intermediate Statistics, and elsewhere. They are on the Distribution Diskette, on the Course Website, and in the Course Folder on the network M (Notebook) Drive.

The QPRO1421 file illustrates the least squares analysis with an example from the Beer's Law Experiment in CHEM 1421 Lab.
The QPRO1422 file illustrates the least squares analysis with an example from the Vapor Pressure of Water Experiment in CHEM 1422 Lab.

(end) (A:/e01_BSta.doc, 4/27/99, rev. 5/5/1999, M. H. Kim)