MATH 2441
Probability and Statistics for Biological Sciences

Introduction

Few of you just beginning the study of probability and statistics will challenge the statement that this branch of mathematics is apparently among the least-liked, most difficult, and probably most useless mathematics courses that many non-mathematicians are "forced" to study.

The sense of uselessness probably arises from several misconceptions. Many people must study statistics as part of an academic program and, because of unhappy experiences in such courses, have avoided jobs or tasks that could have exploited the material covered -- hence the sense that this was not very useful knowledge. Also, we often hear reports of professional researchers claiming statistical support for two completely opposite conclusions about the efficacy of a new medical treatment, or the degree of public support for a particular politician or proposed law. This makes it look as though statistical methods are simply tools for manipulating data to support any desired conclusion, and hence unreliable guides of little practical value to anyone other than rather unscrupulous promoters.

Although statistics is one of the more recently developed branches of mathematics (the roots of some methods go back to the 1700s, but most statistical methods and ideas were developed during the twentieth century), it now spans a large number of methods and applications. A moderately comprehensive introductory textbook will contain 15 to 20 chapters, squeezed into 600 to 900 pages, each dense in formulas, tables of numbers, and fine-printed text. In more advanced treatments, the topics of each of these chapters could expand to span several large volumes, as details of application are worked out.
This makes statistics look like a vast collection of formulas and methods, quite beyond the capacity of all but a rather select (though strange) minority of people to master. So, maybe it's not surprising that most people who have encountered the study of statistics view it as a difficult and basically useless body of knowledge.

Everything we do in this course is intended to convince you of two things: that this could well be the most important and useful material you study in your entire program, and that the large number of apparently complicated techniques of statistics are based on applying just a few basic ideas and strategies to a variety of circumstances of importance in virtually all areas of technology.

The discipline of statistics provides strategies and methods for interpreting and drawing conclusions from data, particularly when we may be far from having complete information about a situation (and this is always the case in issues of important practical significance). By providing standard approaches to such problems, it helps us avoid drawing the wrong conclusion and so taking the wrong sort of action. By providing such standard approaches to interpreting experimental data, it ensures that two researchers looking at the same data will either draw the same conclusions or be able to understand why they came up with different conclusions.

Perhaps it would be nice to leave statistics to those few people who seem to like to do that sort of thing. Unfortunately (perhaps), this is no longer really possible for most areas of technology, and indeed, for many aspects of our lives in general. Advancing technologies, strong competition, and rapidly changing economies mean that the ability to formulate questions, devise appropriate data collection experiments, interpret the data, draw reliable conclusions, and formulate appropriate action plans is an important part of nearly every job today.
The same skills are needed by every consumer trying to sort out the many competing claims about products we buy daily.

But … is statistics really that important in food technology and biotechnology? It may surprise you to learn that these two fields of technology are among those that make the greatest use of statistics (including methods of probability theory). A quick look through leading food technology journals reveals some of the most sophisticated applications of statistical methods in use. When biotechnology becomes involved in projects such as the development of new drugs, statistical studies to determine the efficacy of such medications can end up costing tens of millions of dollars.

David W. Sabo (1999)

In this introductory section of the course, we will do three things to begin the process of making statistics both comprehensible and relevant.

First, we will need to establish some basic terminology and concepts related to the notions of random samples, populations, and so forth. A summary of these ideas is given in the document "What Is/Are Statistics".

Secondly, a brief summary of where the course is going is given in the document "Look Ahead … Terminology and Directions". Whenever things in the course seem to be turning into mental mush, it might be useful to reread these two short documents to remind yourself of the point of it all. It may be that some of what you read in these two documents doesn't seem to make much sense at first. By the end of the course, you should be able to speak of these issues as familiar ideas.

Finally, we will spend some class time doing a simple but revealing experiment -- one designed both to illustrate the problems that arise when we try to study a population through sampling and to point the way to possible solutions for those problems.
Although the document "The Real Problem: An Initial Example" describes the details of the experiment as performed by other students, it is really important that you go through the experimental process yourself to see with your own eyes what happens. That is the best way to understand the difficulty of the problem statistics is trying to solve.