Doing Statistics for Business
Data, Inference, and Decision Making
Marilyn K. Pelosi
Theresa M. Sandifer

Chapter 2
The Language of Statistics

Chapter 2 Objectives
Difference Between the Population & a Sample of the Population
Difference Between a Parameter & a Statistic
Factors that Influence Sample Size: Some Sampling & Sample Size Considerations
Selecting the Sample
Types of Data
The Difference Between Descriptive & Inferential Statistics
Ethical Issues in Data Analysis
Communicating the Results
Basic Summation Notation

The Population is everything you wish to study.

A Variable is used to represent a characteristic of each member of the population.

TRY IT NOW!
The In-line Skate Company
Identifying Possible Variables to Study
There have been a number of failures on the braking device of a new model of roller blades that your company manufactures. What is the population of interest? Name two variables or characteristics that you might wish to study.

Figure 2.1 The Population and a Sample

A Census is a study of the entire population.

Sampling Error is the difference between a characteristic of the entire population and a sample of that population.

The amount of Variation refers to how different the members of the population are from each other with regard to the variable being studied.

Discovery Exercise 2.1
Introduction to Sampling & Variability
Suppose that each set of data in this exercise represents an entire population.
Since we don’t yet have a way to quantify the amount of variability in a population, the data sets are labeled as having a small amount of variability or a large amount of variability. The first data set shows the number of people in 50 families living in a small college town in New England. The second set of data shows the number of people in 50 families living in a large city in the South.

Discovery Exercise 2.1
New England Families: Large Amount of Variability
Average number of people in 50 families: 4.32
0 4 5 7 8 3 9 8 8 8
4 9 9 0 6 4 0 3 9 7
8 2 3 1 9 1 7 5 0 1
0 6 8 2 9 4 0 1 0 3
4 2 4 9 4 1 3 8 0 0

Discovery Exercise 2.1
Southern Families: Small Amount of Variability
Average number of people in 50 families: 4.44
1 3 4 3 5 3 4 6 6 3
6 4 3 4 6 7 5 4 5 5
4 4 5 5 6 3 4 4 7 4
6 3 5 5 5 5 4 4 4 5
4 5 4 3 4 5 4 2 4 4

A Parameter is a number which describes a characteristic of the population.

A Statistic is a number that describes a characteristic of a sample.

The Size of the Population is the number of members of the population. It will be referred to as N.

Figure 2.2 Bigger & Bigger Samples

The Size of the Sample will be referred to as n.

A Biased Sample is a sample which does not fairly represent the population.

TRY IT NOW!
Stress Relief
Identifying Possible Biases in a Sample
You are studying the methods that students at your school use to relieve stress. You decide to use your statistics class as your sample. Why might this be a biased sample?

A Simple Random Sample is a sample that has been selected in such a way that all members of the population have an equal chance of being picked.
In addition, every sample of size n has the same chance of becoming our sample.

A Sampling Frame is a list of all members of the population.

A Table of Random Numbers is a table that consists of numbers randomly generated and listed in the order in which they were generated.

         Column #
Row #    1          2          3          4          5          6
1        094632795  033413186  297556368  562758635  485441982  414437050
2        472960570  256883707  711501513  817883255  467694224  410398128
3        653475420  658953044  785645638  460744361  296126017  976076280
4        716249997  537971597  289063704  193707682  182794408  328703833
5        738968017  574817322  378236162  075254187  843373358  380141891
Figure 2.3 Portion of a Table of Random Numbers

TRY IT NOW!
The Glue Company
Selecting a Simple Random Sample
Select a sample of 5 tubes of glue for the glue company. You can assume that each tube of glue has a 5-digit ID number, which the company uses to track its inventory.

Discovery Exercise 2.2
Introduction to Sampling
Suppose the data shown below represent an entire population. They show the number of people in 50 families living in a small college town in New England. (If you did Discovery Exercise 2.1 then you will recognize this as the same data set.) Now a 2-digit ID number has also been included.
Discovery Exercise 2.2
New England Families: Large Amount of Variability
Average number of people in 50 families: 4.32
ID: 01  0    ID: 02  4    ID: 03  5    ID: 04  7    ID: 05  8
ID: 06  3    ID: 07  9    ID: 08  8    ID: 09  8    ID: 10  8
ID: 11  4    ID: 12  9    ID: 13  9    ID: 14  0    ID: 15  6
ID: 16  4    ID: 17  0    ID: 18  3    ID: 19  9    ID: 20  7
ID: 21  8    ID: 22  2    ID: 23  3    ID: 24  1    ID: 25  9
ID: 26  1    ID: 27  7    ID: 28  5    ID: 29  0    ID: 30  1
ID: 31  0    ID: 32  6    ID: 33  8    ID: 34  2    ID: 35  9
ID: 36  4    ID: 37  0    ID: 38  1    ID: 39  0    ID: 40  3
ID: 41  4    ID: 42  2    ID: 43  4    ID: 44  9    ID: 45  4
ID: 46  1    ID: 47  3    ID: 48  8    ID: 49  0    ID: 50  0

Qualitative Data describe a particular characteristic of a sample item. They are most often non-numerical in nature.

TRY IT NOW!
Dress Down Day
Possible Qualitative Variables
The company considering the dress down day tries this policy out with a sample of employees. After the policy has been in effect for some time, the company decides to measure the change in productivity for the employees. It appears that for some employees productivity has increased, while for others it has decreased. What qualitative data might have been collected to help the company understand the differences observed?

Data that are created by assigning numbers to different categories when the numbers have no real meaning are called Nominal Data.

Data that are created by assigning numbers to categories where the order of assignment has meaning are called Ordinal Data.

Discrete Data are data that can take on only certain values. These values are often integers or whole numbers.

Continuous Data are data that can take on any one of an infinite number of possible values over an interval on the number line. These values are most often the result of measurement.

Tools of Descriptive Statistics allow you to summarize the data.
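
The ideas introduced so far (population, simple random sample, parameter, statistic, and sampling error) can be tied together in a short Python sketch. It is an illustration only, not the book's method: it assumes the 50 New England family sizes from Discovery Exercise 2.2, keyed by their 2-digit IDs, and it uses Python's random module in place of a table of random numbers. The names family_sizes, draw_simple_random_sample, and mean, and the seed value, are hypothetical choices made for this example.

```python
import random

# Population from Discovery Exercise 2.2: number of people in each of the
# 50 New England families, listed in ID order (01 through 50).
family_sizes = [
    0, 4, 5, 7, 8, 3, 9, 8, 8, 8,
    4, 9, 9, 0, 6, 4, 0, 3, 9, 7,
    8, 2, 3, 1, 9, 1, 7, 5, 0, 1,
    0, 6, 8, 2, 9, 4, 0, 1, 0, 3,
    4, 2, 4, 9, 4, 1, 3, 8, 0, 0,
]

# The sampling frame: every member of the population, keyed by 2-digit ID.
population = {f"{i:02d}": size for i, size in enumerate(family_sizes, start=1)}


def draw_simple_random_sample(pop, n, seed=None):
    """Pick n distinct IDs so that every set of n IDs is equally likely."""
    rng = random.Random(seed)        # seed is optional; it makes the draw repeatable
    chosen_ids = rng.sample(sorted(pop), n)
    return {member_id: pop[member_id] for member_id in chosen_ids}


def mean(values):
    """Arithmetic mean of a collection of numbers."""
    values = list(values)
    return sum(values) / len(values)


# Parameter: describes the whole population (N = 50).
population_mean = mean(population.values())

# Statistic: describes one sample (n = 5). The seed value is arbitrary.
sample = draw_simple_random_sample(population, n=5, seed=1)
sample_mean = mean(sample.values())

# Sampling error for this particular sample.
sampling_error = sample_mean - population_mean

print(f"Population mean (parameter): {population_mean:.2f}")
print(f"Sampled IDs and family sizes: {sample}")
print(f"Sample mean (statistic): {sample_mean:.2f}")
print(f"Sampling error: {sampling_error:+.2f}")
```

Changing the seed, or leaving it out, draws a different sample and usually gives a different sample mean, while the population mean stays fixed at 4.32.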
An Inference is a deduction of a conclusion.

Figure 2.4 Pie Chart and Bar Chart (categories A through E; pie chart shares: A 20%, B 30%, C 15%, D 10%, E 25%)

Figure 2.5 Relationship between probability and inferential statistics

The Techniques of Inferential Statistics allow us to draw inferences or conclusions about the population from the sample.

We will use Probability theory to calculate the likelihood of observing or selecting a particular sample from a population.

Sigma Notation is shorthand notation used to write formulas. It is so named because it uses the Greek capital letter sigma, written as Σ.

TRY IT NOW!
The Mail-Order Company
Using Sigma Notation
A mail-order company wants some information on the daily demand for a product that has been heavily advertised. The company looks at the orders for an 8-day period and obtains the following data:
Demand: 31 28 29 32 30 31 29 30
Use the sigma notation to write down the expression which means to add up all the data values.

TRY IT NOW!
The Mail-Order Company
Using Sigma Notation to Sum Differences
A mail-order company wants some information on the daily demand for a product that has been heavily advertised. The company looks at the orders for an 8-day period and obtains the following data:
Demand: 31 28 29 32 30 31 29 30
Use the sigma notation to write down the expression that means to add up all the differences between the data values and the number 30.

Selecting a Sample in Excel
1. From the Tools menu, select Data Analysis.
2. Scroll down the Analysis Tools list and select Sampling. You must tell Excel three things to obtain the sample: (1) the location of the population, (2) the type of sampling method and the number of samples, and (3) where you want the sample placed.
3. Position the cursor in the box labeled Input Range and then highlight the range in the worksheet that contains the data. Since the first row of the population is a title, ID Codes, check the box marked Labels.
4. In the section labeled Sampling Method, click the radio button for the Random sampling method and type “5” in the text box for Number of Samples.
5. In the section labeled Output Options, you can specify that the sample be located in a section of the current worksheet, a new worksheet in the same workbook, or a new workbook.
6. Click OK; the random sample of 5 ID codes will appear in the location you specified.

Figure 2.7 Data Analysis Dialogue Box
Figure 2.8 Sampling Dialogue Box
Figure 2.9 Completed Sampling Dialogue Box
Figure 2.10 Random Sample of 5 ID Codes

Chapter 2 Summary
In this chapter you have learned:
The basic language of statistics.
A Population is the complete group you wish to study.
A Variable is a characteristic of each member of the population.
There are two types of variables: Quantitative and Qualitative.
A Sample is a piece of the population.
Sampling Error is the difference between the behavior of the entire population and a sample of that population.
It is important that a sample be a fair representation of the entire population, that is, an Unbiased Sample.
A Simple Random Sample is a sample that has been selected in such a way that all members of the population have an equal chance of being picked.
You can select a Simple Random Sample by using a Table of Random Numbers or Excel.
Sigma Notation is the shorthand notation used to write equations throughout the remainder of the book.
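
As a worked illustration of sigma notation, take the eight daily demand values from the Mail-Order Company exercise (31, 28, 29, 32, 30, 31, 29, 30). The labels x_1 through x_8 and the index i are assumptions made for this sketch and are not necessarily the book's notation. Under that labeling, adding up all the data values, and adding up all the differences between the data values and 30, could be written as:

```latex
\[
\sum_{i=1}^{8} x_i = 31 + 28 + 29 + 32 + 30 + 31 + 29 + 30 = 240
\]

\[
\sum_{i=1}^{8} (x_i - 30) = 1 + (-2) + (-1) + 2 + 0 + 1 + (-1) + 0 = 0
\]
```

The second sum is zero because 30 is the average of these eight values (240 / 8 = 30), so the differences above and below 30 cancel.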