Chapter 01
What is statistics?
Dr. Halil İbrahim CEBECİ
Statistics Lecture Notes

What is statistics?
Simple explanation: Statistics is a way to get information from data.
Statistics is the study of the collection, organization, analysis, interpretation, and presentation of data.
What do statisticians do? Statisticians contribute to scientific enquiry by applying their knowledge to the design of surveys and experiments; the collection, processing, and analysis of data; and the interpretation of the results.

Data, statistics, and information
Data: Facts, especially numerical facts, collected together for reference or information. Example: 55, 68, 39, 43.
Information: Knowledge communicated concerning some particular fact. Examples: class average, most frequent mark, distribution of marks.
Statistics is a tool for creating new understanding from a set of numbers.

Why?
Because numeric and non-numeric data are everywhere. In marketing, accounting, finance, economics, politics, the sciences, and elsewhere, there are statistics.
- We need to be able to understand statistics when we encounter them.
- We need to avoid being tricked by misleading statistics.
- We need to use statistics to help us make decisions under future uncertainty.

Where?
- Research analysts for Merrill Lynch evaluate many facets of a particular stock before making a “buy” or “sell” recommendation.
- The marketing department at Colgate-Palmolive Co., a manufacturer of soap products, is responsible for making recommendations regarding the potential profitability of a newly developed group of fruit-scented face soaps.
- The United States government is concerned with the present condition of the economy and with predicting future economic trends.
- Managers must make decisions about the quality of their product or service.
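As a small illustration of turning data into information, the four marks listed earlier (55, 68, 39, 43) can be summarized with Python's standard library. This is only a sketch: the function name `summarize_marks` and the 10-point bands used for the distribution are assumptions made for illustration and do not come from the lecture notes.

```python
from collections import Counter
from statistics import mean, multimode


def summarize_marks(marks):
    """Turn raw marks (data) into a few summary facts (information)."""
    # Count how many marks fall in each 10-point band, e.g. 50 means 50-59.
    # The band width of 10 is an illustrative assumption.
    bands = Counter((mark // 10) * 10 for mark in marks)
    return {
        "class_average": mean(marks),
        "most_frequent_marks": multimode(marks),  # all tied values are returned
        "distribution": dict(sorted(bands.items())),
    }


marks = [55, 68, 39, 43]  # the data shown in the notes
info = summarize_marks(marks)
print(info["class_average"])        # 51.25
print(info["most_frequent_marks"])  # [55, 68, 39, 43], since no mark repeats
print(info["distribution"])         # {30: 1, 40: 1, 50: 1, 60: 1}
```

With only four distinct marks, every value ties as "most frequent", which shows that a summary is only as informative as the data behind it.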
Key statistical concepts
Descriptive statistics: Collecting (e.g., survey, experiment, and observation), presenting (e.g., bar charts and graphs), and describing data (e.g., mean).
Inferential statistics: Drawing conclusions and/or making decisions concerning a population based only on sample data.

Population: A population is the group of all items of interest to a statistics practitioner. It is frequently very large; sometimes infinite. E.g., all 5 million Florida voters, per Example 12.5.
Sample: A sample is a set of data drawn from the population. It is potentially very large, but smaller than the population. E.g., a sample of 765 voters exit polled on election day.

[Figure: a sample shown as a subset of the population]
Parameter: A descriptive measure of a population.
Statistic: A descriptive measure of a sample.

Descriptive Statistics
According to Consumer Reports, General Electric washing machine owners reported 9 problems per 100 machines during 2001. The statistic 9 describes the number of problems out of every 100 machines.

Descriptive statistics are methods of organizing, summarizing, and presenting data in a convenient and informative way. These methods include:
- graphical techniques
- numerical techniques
The actual method used depends on what information we would like to extract. Are we interested in measure(s) of central location and/or measure(s) of variability (dispersion)?

Statistical Inference
Descriptive statistics describe the data set that is being analyzed, but they do not allow us to draw any conclusions or make any inferences beyond that data set. Hence we need another branch of statistics: inferential statistics.
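The difference between a parameter and a statistic can be made concrete with a small simulation. The sketch below is hypothetical: the population of 50,000 voters, the 52% support share, and the random seed are made-up values chosen for illustration; only the sample size of 765 echoes the exit poll example above. The function name `compare_parameter_and_statistic` is likewise an assumption.

```python
import random


def compare_parameter_and_statistic(pop_size=50_000, share=0.52,
                                    sample_size=765, seed=1):
    """Return (population proportion, sample proportion) for a made-up electorate."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable

    # Hypothetical population: 1 = supports the candidate, 0 = does not.
    supporters = round(pop_size * share)
    population = [1] * supporters + [0] * (pop_size - supporters)

    # Parameter: a descriptive measure of the whole population.
    parameter = sum(population) / len(population)

    # Statistic: the same measure computed from a random sample only.
    sample = rng.sample(population, sample_size)
    statistic = sum(sample) / len(sample)
    return parameter, statistic


parameter, statistic = compare_parameter_and_statistic()
print(f"population proportion (parameter): {parameter:.3f}")
print(f"sample proportion (statistic):     {statistic:.3f}")
```

The parameter is fixed at 0.52 by construction, while the statistic changes with the sample drawn. It is usually close to the parameter but rarely equal to it, which is why conclusions about a population based on a sample carry some uncertainty.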
Inferential statistics is also a set of methods, but it is used to draw conclusions or inferences about characteristics of populations based on data from a sample.

Statistical inference is the process of making an estimate, prediction, or decision about a population based on a sample. What can we infer about a population’s parameters based on a sample’s statistics?

Rationale:
- Large populations make investigating each member impractical and expensive.
- Easier and cheaper to take a sample and make estimates about the population from the sample.
However:
- Such conclusions and estimates are not always going to be correct.
- For this reason, we build into the statistical inference “measures of reliability”, namely confidence level and significance level.

Examples
Ex1.1 - According to USA Today (15 October 1987), the average size of an American household had fallen from 3.14 persons in 1970 to 2.66 persons in 1987.
a. The 1987 figure of 2.66 is claimed to be the value of a population parameter. What are the population and the parameter?
b. What procedure must be taken to be 100% certain that the value of the population parameter is exactly 2.66?
c. What procedure was likely used to arrive at the 1987 figure of 2.66? Use the terms sample, sample statistic, and inference in your answer.

A1.1a - One can imagine determining the number of persons living in each and every household in the United States. The set of all these (millions of) numbers is the population of interest. The average of this population of numbers is the parameter of interest, which is claimed to be 2.66.
A1.1b - You would have to collect all the numbers in the population (called taking a census) and then compute the average of all the numbers.
A1.1c - It is likely that only a relatively small subset of all American households was selected and the number of persons living in each of these households obtained. The set of numbers obtained for the selected group of households is a sample drawn from the population. The average of the sample values, called the sample statistic, was then computed to be 2.66. The statement that 2.66 is the average size of all American households is an inference about the population parameter; it may or may not be correct.

Exercises
Q1.1 - Thousands of customers have accounts at a large department store. An accountant claims that the average unpaid balance for these accounts is $75, a figure obtained by computing the average of the unpaid balances for 50 of the accounts.
a. Identify the population and its parameter.
b. What is the sample?
c. Is the figure of $75 a parameter or a statistic?

Q1.2 - A psychologist has interviewed 250 school children throughout New York State and found that 80% of them spend at least 25 hours a week watching television.
a. Identify the population parameter and the sample statistic of interest here.
b. Comment on the following inference, which is based on the results of the psychologist’s interviews: 80 percent of American school children spend at least 25 hours a week watching television.

References
Keller, Gerald; Statistics for Management and Economics, 9e, 2012
Groebner, D.F.; Shannon, P.W.; Fry, P.C.; Smith, K.D.; Business Statistics: A Decision-Making Approach, 7e, 2007
McClave, J.T.; Benson, P.G.; Sincich, T.; Statistics for Business and Economics, 11e, 2011