Statistics

Introductory Statistics
Chapter 1 Introduction to Statistics
Chapter 2 Describing Data Sets
Chapter 3 Using Statistics to Summarize Data Sets
Chapter 4 Probability
Chapter 5 Discrete Random Variables
Chapter 6 Normal Random Variables
Chapter 7 Distributions of Sampling Statistics
Chapter 8 Estimation
Chapter 9 Testing Statistical Hypotheses
Chapter 10 Hypothesis Tests Concerning Two Populations
Chapter 11 Analysis of Variance
Chapter 12 Linear Regression
Chapter 13 Chi-Squared Goodness-of-Fit Tests
Chapter 14 Nonparametric Hypotheses Tests

Some Special Features of the Text
Introduction
Statistics in Perspective (觀點)
Real Data
◦ Throughout the text's discussions, examples, perspective highlights, and problems, real data sets are used to enhance students' understanding of the material.
◦ These data sets provide information for the study of current issues in a variety of disciplines, such as health, medicine, sports, business, and education.
Historical Perspectives
Problems/Review Problems
Summary/Key Terms
Formula Summary
Program CD-ROM

Chapter 1 Introduction to Statistics
1.1 Introduction
1.2 The Nature of Statistics
1.3 Populations and Samples
1.4 A Brief History of Statistics
◦ This chapter introduces the subject matter of statistics, the art (技術) of learning from data.
◦ It describes the two branches of statistics, descriptive (描述) and inferential (推理).
◦ The idea of learning about a population (母群) by sampling and studying certain of its members is discussed.

Introduction
Is it better for children to start school at a younger or older age?
◦ Achievement tests
◦ The total number of years spent in school (Table 1.1)
Conclusions:
◦ According to the census (記錄) data, the age at which a child enters school has very little effect on the total number of years that the child spends in school.
◦ To answer such a question, one must collect relevant information (data), and these data must then be described and analyzed.
◦ Such is the subject matter of statistics.

The Nature of Statistics
Definition (statistics)
◦ Statistics (統計學) is the art of learning from data.
◦ Statistics is concerned with the collection of data, their description, and their analysis, which often leads to the drawing of conclusions.

Data Collection
Definition (descriptive statistics)
◦ The part of statistics concerned with the description and summarization of data is called descriptive statistics (描述統計).
For example:
◦ The efficacy of a new drug needs to be determined.
◦ Divide the volunteers into two groups at random: one group receives the drug, and the other group receives a placebo (安慰劑).
◦ Control group: the group that does not receive any treatment (that is, the volunteers who receive a placebo).

Inferential Statistics and Probability Models
Definition (inferential statistics)
◦ The part of statistics concerned with the drawing of conclusions from data is called inferential statistics (推論統計).
◦ When the experiment is completed and the data are described and summarized, we hope to be able to draw a conclusion about the efficacy of the drug.
◦ To do so, it is usually necessary to make some assumptions about the chances (or probabilities) of obtaining the different data values.
◦ The totality of these assumptions is referred to as a probability model for the data.
Conclusions
◦ The basis of statistical inference is the formulation of a probability model to describe the data.
◦ An understanding of statistical inference requires some knowledge of the theory of probability.
◦ Statistical inference starts with the assumption that important aspects of the phenomenon (現象) under study can be described in terms of probabilities, and then it draws conclusions by using data to make inferences about these probabilities.

Populations and Samples
Definition
◦ The total collection of all the elements that we are interested in is called a population (母群).
◦ A subgroup of the population that will be studied in detail is called a sample (樣本).
◦ A given sample generally cannot be considered representative of a population unless it has been chosen in a random manner.
◦ This is because any specific nonrandom rule for selecting a sample often results in one that is inherently biased (偏見) toward some data values as opposed to (與...對照) others.
Definition
◦ A sample of k members of a population is said to be a random sample, sometimes called a simple random sample, if the members are chosen in such a way that all possible choices of the k members are equally likely.

A Brief History of Statistics
A systematic collection of data on the population and the economy was begun in the Italian city-states of Venice (威尼斯) and Florence (佛羅倫斯) during the Renaissance (文藝復興時期, from the late 14th century to about 1600).
The term statistics, derived from the word state, was used to refer to a collection of facts of interest to the state.
In 1662 the English tradesman John Graunt published a book entitled Natural and Political Observations Made upon the Bills of Mortality (死亡率清單).
Table 1.2, which notes the total number of deaths in England and the number due to the plague (瘟疫) for five different plague years, is taken from this book. (John Graunt, 1662)
Graunt used the London bills of mortality (死亡率清單) to estimate the city's population.
To estimate the population of London in 1660, Graunt surveyed households (家庭) in certain London parishes (地方行政區) and discovered that, on average, there were approximately 3 deaths for every 88 people. Equivalently, there was roughly 1 death for every 88/3 people.
Since the London bills cited 13,200 deaths in London for that year, Graunt estimated the London population to be about 13,200 × 88/3 = 387,200.
Graunt also used the London bills of mortality to infer ages at death.
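
Graunt's estimate of London's population, described above, is a simple ratio calculation. The following is a minimal Python sketch that uses only the figures quoted on the slide. The function name `estimate_population` and its parameters are our own illustrative choices, not something from the text.

```python
from fractions import Fraction

def estimate_population(total_deaths, deaths_in_sample, people_in_sample):
    """Scale a recorded death count up to a population size.

    Hypothetical helper for illustration: if a survey found
    `deaths_in_sample` deaths among `people_in_sample` people, then each
    recorded death stands for people_in_sample / deaths_in_sample people.
    """
    people_per_death = Fraction(people_in_sample, deaths_in_sample)
    return total_deaths * people_per_death

# Figures from the slide: about 3 deaths per 88 people,
# and 13,200 deaths cited in the London bills for that year.
london_1660 = estimate_population(13_200, 3, 88)
print(london_1660)  # 387200
```

Exact fractions are used here so that 88/3 is not rounded before multiplying. The estimate is only as good as the assumption that the surveyed parishes had the same death rate as London as a whole.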

A Brief History of Statistics
In the early 20th century, two of the most important areas of applied statistics were population biology (生物學) and agriculture (農耕).
Nowadays the ideas of statistics are everywhere.
◦ Descriptive statistics are featured in every newspaper and magazine.
◦ Statistical inference has become indispensable (必需的) to public health and medical research, to marketing (行銷) and quality control (品管), to education, to accounting (會計), to economics (經濟), to meteorological forecasting (氣象預報), to polling (投票) and surveys (調查), to sports, to insurance (保險), to gambling (賭博), and to all research that makes any claim to being scientific.

KEY TERMS
Statistics: The art of learning from data.
Descriptive statistics: The part of statistics that deals with the description and summarization of data.
Inferential statistics: The part of statistics that is concerned with drawing conclusions from data.
Probability model: The mathematical assumptions relating to the likelihood of different data values.
Population: A collection of elements of interest.
Sample: A subgroup of the population that is to be studied.
Random sample of size k: A sample chosen in such a manner that all subgroups of size k are equally likely to be selected.
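
The definition of a random sample of size k can be illustrated with a short Python sketch. Everything specific here is an assumption made for illustration: the five-member population, the choice k = 2, the seed, and the helper name `draw_random_sample`. The sketch relies on the standard library's `random.sample`, which draws without replacement, and then checks empirically that each of the possible subgroups of size k turns up about equally often.

```python
import random
from collections import Counter
from itertools import combinations

population = ["A", "B", "C", "D", "E"]  # hypothetical population of 5 members
k = 2                                   # sample size (an assumption)
n_draws = 20_000

rng = random.Random(0)  # fixed seed so reruns give the same draws

def draw_random_sample(members, size, rng):
    """Draw a simple random sample and return it as a sorted tuple."""
    return tuple(sorted(rng.sample(members, size)))

# Every possible subgroup of size k: 10 of them for 5 members and k = 2.
all_subsets = list(combinations(population, k))

# Draw many samples and count how often each subgroup is selected.
counts = Counter(draw_random_sample(population, k, rng) for _ in range(n_draws))

# Each observed frequency should be close to 1/10.
for subset in all_subsets:
    print(subset, round(counts[subset] / n_draws, 3))
```

A nonrandom rule, such as always taking the first k members, would select one subgroup every time and could not be treated as representative in this sense.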